Comparing discriminant analysis, neural networks and logistic regression for predicting species distributions: a case study with a Himalayan river bird

Abstract

We assessed the occurrence of a common river bird, the Plumbeous Redstart Rhyacornis fuliginosus, along 180 independent streams in the Indian and Nepali Himalaya. We then compared the performance of multiple discriminant analysis (MDA), logistic regression (LR) and artificial neural networks (ANN) in predicting this species’ presence or absence from 32 variables describing stream altitude, slope, habitat structure, chemistry and invertebrate abundance. Using the entire data set (= training set) and a threshold for accepting presence in ANN and LR set to P ≥ 0.5, ANN correctly classified marginally more cases (88%) than either LR (83%) or MDA (84%). Model performance was then assessed using two methods of data partitioning. In a ‘leave-one-out’ approach, LR correctly predicted more cases (82%) than MDA (73%) or ANN (69%). However, in a holdout procedure, all the methods performed similarly (73–75%). All methods predicted true absence (i.e. specificity in holdout: 81–85%) better than true presence (i.e. sensitivity: 57–60%). These effects reflect species’ prevalence (= frequency of occurrence), but are seldom considered in distribution modelling. Despite occurring at only 36% of the sites, Plumbeous Redstarts are one of the most common Himalayan river birds, and problems will be greater with less common species. Both LR and ANN require an arbitrary threshold probability (often P = 0.5) at which to accept species presence from model prediction. Simulations involving varied prevalence revealed that LR was particularly sensitive to threshold effects. Receiver operating characteristic (ROC) plots were therefore used to compare model performance on test data at a range of thresholds; LR always outperformed ANN. This case study supports the need to test species’ distribution models with independent data, and to use a range of criteria in assessing model performance. ANN do not yet have major advantages over conventional multivariate methods for assessing bird distributions. LR and MDA were both more efficient in the use of computer time than ANN, and also more straightforward in providing testable hypotheses about environmental effects on occurrence. However, LR was apparently subject to chance significant effects from explanatory variables, emphasising the well-known risks of models based purely on correlative data.
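
The threshold-dependent measures named in the abstract (sensitivity, specificity, overall correct classification) and the threshold-independent ROC summary can be illustrated with a minimal sketch. The ten sites and predicted probabilities below are invented purely for illustration and are not data from the study; the function names `confusion_rates` and `roc_auc` are hypothetical, and the area under the ROC curve is computed here with the rank-based (Mann-Whitney) formulation, which may differ from the procedure the authors used.

```python
from typing import List, Tuple


def confusion_rates(y_true: List[int], p_hat: List[float],
                    threshold: float) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, accuracy) when presence is
    accepted for predicted probabilities >= threshold."""
    tp = fn = tn = fp = 0
    for y, p in zip(y_true, p_hat):
        predicted_present = p >= threshold
        if y == 1 and predicted_present:
            tp += 1          # true presence correctly predicted
        elif y == 1:
            fn += 1          # presence missed
        elif not predicted_present:
            tn += 1          # true absence correctly predicted
        else:
            fp += 1          # absence predicted as presence
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(y_true)
    return sensitivity, specificity, accuracy


def roc_auc(y_true: List[int], p_hat: List[float]) -> float:
    """Area under the ROC curve: the probability that a randomly chosen
    presence site scores higher than a randomly chosen absence site
    (ties count as one half)."""
    pos = [p for y, p in zip(y_true, p_hat) if y == 1]
    neg = [p for y, p in zip(y_true, p_hat) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: 4 presences among 10 sites (made-up values).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
p_hat = [0.9, 0.7, 0.45, 0.3, 0.6, 0.4, 0.35, 0.2, 0.1, 0.05]

for t in (0.3, 0.5, 0.7):
    sens, spec, acc = confusion_rates(y_true, p_hat, t)
    print(f"threshold={t:.1f} sensitivity={sens:.2f} "
          f"specificity={spec:.2f} accuracy={acc:.2f}")
print(f"AUC={roc_auc(y_true, p_hat):.3f}")
```

In this toy case the thresholds 0.3 and 0.5 give the same overall accuracy while trading sensitivity against specificity, which is why a single accuracy figure at P = 0.5 can hide how a model behaves for a species of low prevalence.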

Journal

Ecological Modelling, Elsevier

Published: Aug 17, 1999
