PREGO: A Literature and Data-Mining Resource to Associate Microorganisms, Biological Processes, and Environment Types

Haris Zafeiropoulos, Savvas Paragkamian, Stelios Ninidakis, Georgios A. Pavlopoulos, Lars Juhl Jensen, Evangelos Pafilis*

*Corresponding author for this work

Research output: Contribution to journal › Journal article › Research › peer-review

14 Citations (Scopus)
12 Downloads (Pure)

Abstract

To elucidate ecosystem functioning, it is fundamental to recognize what processes occur in which environments (where) and which microorganisms carry them out (who). Here, we present PREGO, a one-stop-shop knowledge base providing such associations. PREGO combines text mining and data integration techniques to mine such what-where-who associations from data and metadata scattered in the scientific literature and in public omics repositories. Microorganisms, biological processes, and environment types are identified and mapped to ontology terms from established community resources. Analyses of comentions in text and co-occurrences in metagenomics data/metadata are performed to extract associations and a level of confidence is assigned to each of them thanks to a scoring scheme. The PREGO knowledge base contains associations for 364,508 microbial taxa, 1090 environmental types, 15,091 biological processes, and 7971 molecular functions with a total of almost 58 million associations. These associations are available through a web portal, an Application Programming Interface (API), and bulk download. By exploring environments and/or processes associated with each other or with microbes, PREGO aims to assist researchers in design and interpretation of experiments and their results. To demonstrate PREGO’s capabilities, a thorough presentation of its web interface is given along with a meta-analysis of experimental results from a lagoon-sediment study of sulfur-cycle related microbes.

Original language: English
Article number: 293
Journal: Microorganisms
Volume: 10
Issue number: 2
Number of pages: 22
ISSN: 2076-2607
Status: Published - 2022

Bibliographical note

Funding Information:
This project was funded by the Hellenic Foundation for Research and Innovation (HFRI) & the General Secretariat for Research and Innovation (GSRI), under Grant No. 241, PREGO project. S.N. was supported by an EOSC-Life project (PID:14325). G.A.P. was supported by HFRI (1st call of research projects to support faculty members and researchers, Grant:1855-BOLOGNA) and the Marie Skłodowska-Curie Individual Fellowships, MSCA-IF-EF-CAR (Grant ID: 838018-H2020-MSCA-IF-2018). L.J.J. was supported by the Novo Nordisk Foundation [NNF14CC0001].

Funding Information:
Acknowledgments: This research was supported in part through computational resources provided by IMBBC (Institute of Marine Biology, Biotechnology and Aquaculture) of the HCMR (Hellenic Centre for Marine Research). Funding for establishing the IMBBC HPC has been received by the MARBIGEN (EU Regpot) project, LifeWatchGreece RI, and the CMBR (Centre for the study and sustainable exploitation of Marine Biological Resources) RI. We would like to thank our colleagues, Lucia Fanini, Christina Pavloudi, Ioulia Santi, George Tsamis, and Miguel Desmarais, for their contribution, great discussions, and their feedback throughout the development of PREGO. We also thank Antonis Potirakis and Dimitris Sidirokastritis for their sysadmin server support. We thank Michail Kouratoras for the design of the PREGO logo. Last but not least, we thank Manolis Badouvas for his administrative feedback and support.

Publisher Copyright:
© 2022 by the authors. Licensee MDPI, Basel, Switzerland.
