TY - JOUR
T1 - Extending the Latent Dirichlet Allocation model to presence/absence data
T2 - A case study on North American breeding birds and biogeographical shifts expected from climate change
AU - Valle, Denis
AU - Albuquerque, Pedro
AU - Zhao, Qing
AU - Barberan, Albert
AU - Fletcher, Robert J.
N1 - Funding Information:
We thank the numerous comments provided by Ben Baiser, Daijiang Li, Gordon Burleigh, Tamer Kahveci, Rasha Assad, Joshua Ladau, Ermias Azeria, and Fred Johnson. This work was partly supported by the US Department of Agriculture National Institute of Food and Agriculture McIntire–Stennis project 1005163 and by the US National Science Foundation award 1458034 to DV. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Funding Information:
US Department of Agriculture National Institute of Food and Agriculture, Grant/Award Number: 1005163; National Science Foundation, Grant/Award Number: 1458034
Publisher Copyright:
© 2018 John Wiley & Sons Ltd
PY - 2018/11
Y1 - 2018/11
N2 - Understanding how species composition varies across space and time is fundamental to ecology. While multiple methods have been created to characterize this variation through the identification of groups of species that tend to co-occur, most of these methods unfortunately are not able to represent gradual variation in species composition. The Latent Dirichlet Allocation (LDA) model is a mixed-membership method that can represent gradual changes in community structure by delineating overlapping groups of species, but its use has been limited because it requires abundance data and requires users to a priori set the number of groups. We substantially extend LDA to accommodate widely available presence/absence data and to simultaneously determine the optimal number of groups. Using simulated data, we show that this model is able to accurately determine the true number of groups, estimate the underlying parameters, and fit with the data. We illustrate this method with data from the North American Breeding Bird Survey (BBS). Overall, our model identified 18 main bird groups, revealing striking spatial patterns for each group, many of which were closely associated with temperature and precipitation gradients. Furthermore, by comparing the estimated proportion of each group for two time periods (1997–2002 and 2010–2015), our results indicate that nine (of 18) breeding bird groups exhibited an expansion northward and contraction southward of their ranges, revealing subtle but important community-level biodiversity changes at a continental scale that are consistent with those expected under climate change. Our proposed method is likely to find multiple uses in ecology, being a valuable addition to the toolkit of ecologists.
AB - Understanding how species composition varies across space and time is fundamental to ecology. While multiple methods have been created to characterize this variation through the identification of groups of species that tend to co-occur, most of these methods unfortunately are not able to represent gradual variation in species composition. The Latent Dirichlet Allocation (LDA) model is a mixed-membership method that can represent gradual changes in community structure by delineating overlapping groups of species, but its use has been limited because it requires abundance data and requires users to a priori set the number of groups. We substantially extend LDA to accommodate widely available presence/absence data and to simultaneously determine the optimal number of groups. Using simulated data, we show that this model is able to accurately determine the true number of groups, estimate the underlying parameters, and fit with the data. We illustrate this method with data from the North American Breeding Bird Survey (BBS). Overall, our model identified 18 main bird groups, revealing striking spatial patterns for each group, many of which were closely associated with temperature and precipitation gradients. Furthermore, by comparing the estimated proportion of each group for two time periods (1997–2002 and 2010–2015), our results indicate that nine (of 18) breeding bird groups exhibited an expansion northward and contraction southward of their ranges, revealing subtle but important community-level biodiversity changes at a continental scale that are consistent with those expected under climate change. Our proposed method is likely to find multiple uses in ecology, being a valuable addition to the toolkit of ecologists.
KW - biodiversity
KW - breeding bird groups
KW - climate change
KW - cluster analysis
KW - community ecology
KW - mixed-membership model
KW - multivariate statistics
KW - presence/absence data
UR - http://www.scopus.com/inward/record.url?scp=85053214865&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85053214865&partnerID=8YFLogxK
U2 - 10.1111/gcb.14412
DO - 10.1111/gcb.14412
M3 - Article
C2 - 30058746
AN - SCOPUS:85053214865
VL - 24
SP - 5560
EP - 5572
JO - Global Change Biology
JF - Global Change Biology
SN - 1354-1013
IS - 11
ER -