A review of the heterogeneous landscape of biodiversity databases: Opportunities and challenges for a synthesized biodiversity knowledge base

Xiao Feng, Brian J. Enquist, Daniel S. Park, Brad Boyle, David D. Breshears, Rachael V. Gallagher, Aaron Lien, Erica A. Newman, Joseph R. Burger, Brian S. Maitner, Cory Merow, Yaoqi Li, Kimberly M. Huynh, Kacey Ernst, Elizabeth Baldwin, Wendy Foden, Lee Hannah, Peter M. Jørgensen, Nathan J.B. Kraft, Jon C. Lovett, Pablo A. Marquet, Brian J. McGill, Naia Morueta-Holme, Danilo M. Neves, Mauricio M. Núñez-Regueiro, Ary T. Oliveira-Filho, Robert K. Peet, Michiel Pillet, Patrick R. Roehrdanz, Brody Sandel, Josep M. Serra-Diaz, Irena Šímová, Jens Christian Svenning, Cyrille Violle, Trang D. Weitemier, Susan Wiser, Laura López-Hoffman

Research output: Contribution to journal, Review article, peer-review

26 Scopus citations


Aim: Addressing global environmental challenges requires access to biodiversity data across wide spatial, temporal and taxonomic scales. Availability of such data has increased exponentially recently with the proliferation of biodiversity databases. However, heterogeneous coverage, protocols, and standards have hampered integration among these databases. To stimulate the next stage of data integration, here we present a synthesis of major databases, and investigate (a) how the coverage of databases varies across taxonomy, space, and record type; (b) what degree of integration is present among databases; (c) how integration of databases can increase biodiversity knowledge; and (d) the barriers to database integration.
Location: Global.
Time period: Contemporary.
Major taxa studied: Plants and vertebrates.
Methods: We reviewed 12 established biodiversity databases that mainly focus on geographic distributions and functional traits at global scale. We synthesized information from these databases to assess the status of their integration and major knowledge gaps and barriers to full integration. We estimated how improved integration can increase the data coverage for terrestrial plants and vertebrates.
Results: Every database reviewed had a unique focus of data coverage. Exchanges of biodiversity information were common among databases, although not always clearly documented. Functional trait databases were more isolated than those pertaining to species distributions. Variation and potential incompatibility of taxonomic systems used by different databases posed a major barrier to data integration. We found that integration of distribution databases could lead to increased taxonomic coverage that corresponds to 23 years’ advancement in data accumulation, and improvement in taxonomic coverage could be as high as 22.4% for trait databases.
Main conclusions: Rapid increases in biodiversity knowledge can be achieved through the integration of databases, providing the data necessary to address critical environmental challenges. Full integration across databases will require tackling the major impediments to data integration: taxonomic incompatibility, lags in data exchange, barriers to effective data synchronization, and isolation of individual initiatives.

Original language: English (US)
Pages (from-to): 1242-1260
Number of pages: 19
Journal: Global Ecology and Biogeography
Issue number: 7
State: Published - Jul 2022


Keywords

  • big data
  • biodiversity informatics
  • biogeography
  • database integration
  • functional trait
  • taxonomic system

ASJC Scopus subject areas

  • Global and Planetary Change
  • Ecology, Evolution, Behavior and Systematics
  • Ecology

