TY - JOUR
T1 - The STRING database in 2021: customizable protein-protein networks, and functional characterization of user-uploaded gene/measurement sets
AU - Szklarczyk, Damian
AU - Gable, Annika L.
AU - Nastou, Katerina C.
AU - Lyon, David
AU - Kirsch, Rebecca
AU - Pyysalo, Sampo
AU - Doncheva, Nadezhda T.
AU - Legeay, Marc
AU - Fang, Tao
AU - Bork, Peer
AU - Jensen, Lars J.
AU - von Mering, Christian
PY - 2021
Y1 - 2021
N2 - Cellular life depends on a complex web of functional associations between biomolecules. Among these associations, protein-protein interactions are particularly important due to their versatility, specificity and adaptability. The STRING database aims to integrate all known and predicted associations between proteins, including both physical interactions as well as functional associations. To achieve this, STRING collects and scores evidence from a number of sources: (i) automated text mining of the scientific literature, (ii) databases of interaction experiments and annotated complexes/pathways, (iii) computational interaction predictions from co-expression and from conserved genomic context and (iv) systematic transfers of interaction evidence from one organism to another. STRING aims for wide coverage; the upcoming version 11.5 of the resource will contain more than 14 000 organisms. In this update paper, we describe changes to the text-mining system, a new scoring-mode for physical interactions, as well as extensive user interface features for customizing, extending and sharing protein networks. In addition, we describe how to query STRING with genome-wide, experimental data, including the automated detection of enriched functionalities and potential biases in the user's query data.
KW - PREDICTION
KW - INTEGRATION
KW - DISEASE
KW - IDENTIFICATION
KW - DISCOVERY
KW - BIOLOGY
U2 - 10.1093/nar/gkaa1074
DO - 10.1093/nar/gkaa1074
M3 - Journal article
C2 - 33237311
VL - 49
SP - D605
EP - D612
JO - Nucleic Acids Research
JF - Nucleic Acids Research
SN - 0305-1048
IS - D1
ER -