Web Ecology An open-access peer-reviewed journal
Volume 13, issue 1
Web Ecol., 13, 13–19, 2013
© Author(s) 2013. This work is distributed under
the Creative Commons Attribution 3.0 License.

Standard article 13 May 2013

Prevalence, statistical thresholds, and accuracy assessment for species distribution models

B. B. Hanberry and H. S. He
  • School of Natural Resources, Univ. of Missouri, 203 Natural Resources Building, Columbia, MO 65211, USA

Abstract. For species distribution models, species frequency is termed prevalence, and for unbiased samples, prevalence in samples should be similar to natural species prevalence. However, modelers commonly adjust sampling prevalence, producing a modeling prevalence that has a different frequency of occurrences than sampling prevalence. Two issues remain unresolved: (1) the effect of using sampling prevalence compared to adjusted modeling prevalence and (2) the modifications to thresholds, which convert continuous probabilities to discrete presence or absence predictions, that are necessary to account for prevalence. We examined effects of prevalence, thresholds, and two types of pseudoabsences on model accuracy. Use of sampling prevalence produced models similar to those produced with adjusted modeling prevalences. Mean correlation between predicted probabilities of the least (0.33) and greatest (0.83) modeling prevalence was 0.86. Mean predicted probability values increased with increasing prevalence; therefore, unlike constant thresholds, varying the threshold to match prevalence values was effective in holding true positive rate, true negative rate, and species prediction areas relatively constant for every modeling prevalence. The area under the curve (AUC) values appeared to be as informative as sensitivity and specificity when using surveyed pseudoabsences as absent cases, but when the entire study area was coded, AUC values reflected the area of predicted presence as absent. Less frequent species had greater AUC values when pseudoabsences represented the study background. Modeling prevalence had a mild impact on species distribution models and accuracy assessment metrics when the threshold varied with prevalence. Misinterpretation of AUC values is possible when AUC values are based on background absences, which correlate with frequency of species.
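The quantities named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code or data: the function names (`classify`, `rates`, `auc`) and the toy labels and probabilities are assumptions made up for illustration. It shows a threshold set equal to modeling prevalence, a constant threshold of 0.5 for comparison, and a rank-based AUC computed over the same presence and absence cases.

```python
def classify(probabilities, threshold):
    """Convert continuous probabilities to 1 (presence) or 0 (absence)."""
    return [1 if p >= threshold else 0 for p in probabilities]


def rates(labels, predictions):
    """Return (true positive rate, true negative rate)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, tn / n_neg


def auc(labels, probabilities):
    """Rank-based AUC: chance that a random presence case scores
    higher than a random absence case (ties count as one half)."""
    pos = [p for y, p in zip(labels, probabilities) if y == 1]
    neg = [p for y, p in zip(labels, probabilities) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study): 3 presences, 6 pseudoabsences.
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0]
probs = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2, 0.2, 0.1, 0.1]

# Modeling prevalence is the fraction of presence cases in the model data.
prevalence = sum(labels) / len(labels)

matched = classify(probs, prevalence)   # threshold varies with prevalence
constant = classify(probs, 0.5)         # constant threshold, for comparison

print("prevalence-matched TPR, TNR:", rates(labels, matched))
print("constant-threshold TPR, TNR:", rates(labels, constant))
print("AUC:", auc(labels, probs))
```

In this toy case the prevalence-matched threshold retains a presence that the constant threshold misses. The example only demonstrates the mechanics and does not reproduce any result reported in the article.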
