Most methods for modeling species distributions from occurrence records require additional data representing the range of environmental conditions in the modeled region. These data, called background or pseudo-absence data, are usually drawn at random from the entire region, whereas occurrence collection is often spatially biased toward easily accessed areas. Since the spatial bias generally results in environmental bias, the difference between occurrence collection and background sampling may lead to inaccurate models. To correct the estimation, we propose choosing background data with the same bias as occurrence data. We investigate theoretical and practical implications of this approach. Accurate information about spatial bias is usually lacking, so explicit biased sampling of background sites may not be possible. However, it is likely that an entire target group of species observed by similar methods will share similar bias. We therefore explore the use of all occurrences within a target group as biased background data. We compare model performance using target-group background and randomly sampled background on a comprehensive collection of data for 226 species from diverse regions of the world. We find that target-group background improves average performance for all the modeling methods we consider, with the choice of background data having as large an effect on predictive performance as the choice of modeling method. The performance improvement due to target-group background is greatest when there is strong bias in the target-group presence records. Our approach applies to regression-based modeling methods that have been adapted for use with occurrence data, such as generalized linear or additive models and boosted regression trees, and to Maxent, a probability density estimation method. 
We argue that increased awareness of the implications of spatial bias in surveys, and possible modeling remedies, will substantially improve predictions of species distributions.
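The contrast between the two kinds of background data can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy grid of cells, and the choice to collapse repeated records into a set of distinct sites are all assumptions made for illustration.

```python
import random


def random_background(all_cells, n, seed=0):
    """Uniform background: n cells drawn at random from the whole region."""
    rng = random.Random(seed)
    return rng.sample(list(all_cells), n)


def target_group_background(records):
    """Biased background: every cell with a record of any target-group species.

    records: iterable of (species, cell_id) pairs (hypothetical format).
    Collapsing duplicates into distinct cells is an assumption of this sketch.
    """
    return sorted({cell for _species, cell in records})


# Toy region of 10 cells; survey effort is concentrated in cells 0-3
# (for example, the easily accessed part of the region).
all_cells = range(10)
records = [
    ("sp_a", 0), ("sp_a", 1),
    ("sp_b", 1), ("sp_b", 2),
    ("sp_c", 3), ("sp_c", 0),
]

# Presences of the focal species, and the two candidate backgrounds.
presences = [cell for species, cell in records if species == "sp_a"]
tg_background = target_group_background(records)
uniform_background = random_background(all_cells, 4)

print("presences:", presences)
print("target-group background:", tg_background)
print("uniform background:", uniform_background)
```

In this toy setup the target-group background covers only the surveyed cells, so it carries the same spatial bias as the presence records, whereas the uniform background may include cells that were never visited.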
Ecological Applications – Ecological Society of America
Published: Jan 1, 2009
Keywords: background data; presence-only distribution models; niche modeling; pseudo-absence; sample selection bias; species distribution modeling; target group