occCite: Tools for querying and managing large biodiversity occurrence datasets

Hannah L. Owens*, Cory Merow, Brian S. Maitner, Jamie M. Kass, Vijay Barve, Robert P. Guralnick

*Corresponding author for this work

Publication: Contribution to journal › Journal article › Research › peer review

Abstract

The amount of observational and specimen-based biodiversity data available to researchers is increasing exponentially, yet the ability to manage and cite large, complex biodiversity datasets lags behind. This management and citation gap impedes reproducibility for data users and the ability of data publishers to track use and accumulate use citations, ultimately harming the longer-term sustainability of the still-emerging enterprise of research data-sharing. Here we present an R package, occCite (v. 0.4.7), to aid researchers in querying large species occurrence data aggregators (specifically, the Global Biodiversity Information Facility, GBIF, and the Botanical Information and Ecology Network, BIEN) and in storing metadata such as primary data providers, database accession dates, DOIs, and the taxonomic source used for search terms. occCite also includes tools to summarize and visualize query results and generate citation lists of all data providers and software packages used during the query process. We provide examples of a basic occurrence search and citation workflow as well as an advanced workflow using features for custom optimized searches, visualization, and summary procedures. occCite improves upon existing R packages by uniting data from powerful API-based query packages (rgbif and BIEN) into a unified object-based framework, while maintaining metadata vital to best-practice recommendations for documenting biodiversity analysis workflows. occCite aims to efficiently close the gap in the citation cycle between primary data providers and final research products, allowing researchers to meet dataset documentation standards without sacrificing time and resources to the demands of providing increasing levels of detail on their datasets.
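
As a rough illustration of the citation-tracking idea described in the abstract, the sketch below keeps provenance metadata (aggregator, primary data provider, dataset DOI, access date) alongside each occurrence record and collapses the records into one citation line per dataset. It is not occCite's API (occCite is an R package); the `Occurrence` class, the `citation_list` function, the citation wording, and all sample records and DOIs are hypothetical placeholders chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Occurrence:
    """One occurrence record plus the provenance needed to cite it."""
    species: str
    aggregator: str   # e.g. "GBIF" or "BIEN"
    provider: str     # primary data provider (dataset name)
    dataset_doi: str  # placeholder DOI in this sketch
    accessed: date    # date the aggregator was queried


def citation_list(records):
    """Return one citation string per distinct dataset, sorted by provider."""
    # Count how many records each (provider, DOI, aggregator, date) contributed.
    counts = Counter(
        (r.provider, r.dataset_doi, r.aggregator, r.accessed) for r in records
    )
    lines = []
    for (provider, doi, aggregator, accessed), n in sorted(counts.items()):
        plural = "s" if n != 1 else ""
        lines.append(
            f"{provider}. https://doi.org/{doi}. Accessed via {aggregator} "
            f"on {accessed.isoformat()} ({n} record{plural})."
        )
    return lines


# Made-up records for illustration only; none refer to real datasets.
records = [
    Occurrence("Protea cynaroides", "GBIF", "Example Herbarium A",
               "10.0000/example-a", date(2021, 1, 15)),
    Occurrence("Protea cynaroides", "GBIF", "Example Herbarium A",
               "10.0000/example-a", date(2021, 1, 15)),
    Occurrence("Protea cynaroides", "BIEN", "Example Plot Network B",
               "10.0000/example-b", date(2021, 1, 15)),
]

for line in citation_list(records):
    print(line)
```

Keeping the provenance fields on every record, rather than in a separate log, means the citation list can be regenerated at any point after filtering or merging records from several aggregators.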

Original language: English
Journal: Ecography
Volume: 44
Issue number: 8
Pages (from-to): 1228-1235
Number of pages: 8
ISSN: 0906-7590
DOI
Status: Published - 2021

Bibliographical note

Funding Information:
Funding for this project was provided by a seed grant from the University of Florida Biodiversity and Informatics Institutes and a second-place Ebbe Nielsen Challenge prize from the Global Biodiversity Information Facility. CM acknowledges funding from NSF grants DBI‐1913673 and DBI‐1661510.

Publisher Copyright:
© 2021 The Authors. Ecography published by John Wiley & Sons Ltd on behalf of Nordic Society Oikos
