STRING


In molecular biology, STRING is a biological database and web resource of known and predicted protein–protein interactions.
The STRING database contains information from numerous sources, including experimental data, computational prediction methods and public text collections. It is freely accessible and regularly updated. The resource also highlights functional enrichments in user-provided lists of proteins, using a number of functional classification systems such as GO, Pfam and KEGG. The latest version 10.5 contains information on about 9.6 million proteins from more than 2000 organisms. STRING has been developed by a consortium of academic institutions including CPR, EMBL, KU, SIB, TUD and UZH.

Usage

Protein–protein interaction networks are an important ingredient for the system-level understanding of cellular processes.
Such networks can be used for filtering and assessing functional genomics data and for providing an intuitive platform for annotating structural, functional and evolutionary properties of proteins.
Exploring the predicted interaction networks can suggest new directions for future experimental research and provide cross-species predictions for efficient interaction mapping.

Features

The data is weighted and integrated, and a confidence score is calculated for every protein interaction. Results of the various computational predictions can be inspected from different designated views. There are two modes of STRING: protein-mode and COG-mode. Interactions that have been described in one organism are propagated to proteins in other organisms by inference of orthology. A web interface is available to access the data and to give a fast overview of the proteins and their interactions. A plug-in for Cytoscape that uses STRING data is also available.
Another way to access STRING data is through the application programming interface (API), by constructing a URL that contains the request.
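As an illustration of constructing such a request URL, the following minimal Python sketch assembles one without sending it. The base address, the path layout (output format followed by method name), the parameter names and the carriage-return separator between identifiers are assumptions based on the publicly documented API and should be checked against the current STRING documentation. The helper name build_string_url is hypothetical.

```python
from urllib.parse import urlencode

# Assumed base address and path layout of the STRING API; verify before use.
BASE_URL = "https://string-db.org/api"


def build_string_url(method, identifiers, species, output_format="tsv"):
    """Return a request URL for one API method and a list of protein names."""
    params = {
        # Multiple identifiers are assumed to be separated by carriage returns.
        "identifiers": "\r".join(identifiers),
        "species": species,  # NCBI taxonomy identifier, e.g. 9606 for human
    }
    return f"{BASE_URL}/{output_format}/{method}?{urlencode(params)}"


url = build_string_url("network", ["TP53", "MDM2"], species=9606)
print(url)
```

The resulting string can be opened in a browser or passed to any HTTP client. Building the query with urlencode ensures that the separator and any special characters in protein names are escaped correctly.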

Data sources

Like many other databases that store protein association knowledge, STRING imports data from experimentally derived protein–protein interactions through literature curation. STRING also stores computationally predicted interactions from text mining of scientific texts, interactions computed from genomic features, and interactions transferred from model organisms based on orthology.
All predicted or imported interactions are benchmarked against a common reference of functional partnership as annotated by KEGG.
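Because each of these sources yields its own evidence for an interaction, the per-source scores have to be merged into the single confidence score mentioned under Features. The sketch below shows one common way to do this, assuming the evidence channels are independent: the combined score is one minus the product of the individual complements. This is an illustrative assumption, not necessarily the exact scheme STRING uses, and the channel names and values are hypothetical.

```python
from math import prod


def combine_scores(channel_scores):
    """Combine per-channel confidences in [0, 1] as 1 - product(1 - s)."""
    scores = list(channel_scores)
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"score out of range: {s}")
    return 1.0 - prod(1.0 - s for s in scores)


# Hypothetical channel scores for one protein pair.
evidence = {"experiments": 0.6, "text_mining": 0.5, "orthology": 0.2}
combined = combine_scores(evidence.values())
print(round(combined, 3))
```

Under this rule the combined score is never lower than the strongest single channel, and several weak but independent lines of evidence can add up to a high overall confidence.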

Imported data

STRING imports protein association knowledge from databases of physical interaction and databases of curated biological pathway knowledge.
Links are supplied to the originating data of the respective experimental repositories and database resources.

Text mining

A large body of scientific texts is parsed to search for statistically relevant co-occurrences of gene names.
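The counting step behind such a search can be sketched in Python as follows. This is a simplified illustration, not STRING's actual pipeline: it only counts how many texts mention both names of a pair, whereas judging statistical relevance would additionally require comparing these counts with what chance alone would produce. The function name, the example abstracts and the gene list are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_counts(texts, gene_names):
    """Count, for each gene pair, the number of texts mentioning both."""
    patterns = {
        g: re.compile(rf"\b{re.escape(g)}\b", re.IGNORECASE) for g in gene_names
    }
    counts = Counter()
    for text in texts:
        # Sort the names so each pair is always stored in the same order.
        found = sorted(g for g, p in patterns.items() if p.search(text))
        counts.update(combinations(found, 2))
    return counts


abstracts = [
    "TP53 is regulated by MDM2 in response to DNA damage.",
    "MDM2 binds TP53 and promotes its degradation.",
    "BRCA1 participates in DNA repair.",
]
counts = cooccurrence_counts(abstracts, ["TP53", "MDM2", "BRCA1"])
print(counts[("MDM2", "TP53")])
```

Word-boundary matching avoids counting a gene name that merely appears inside a longer word, and each text contributes at most once per pair regardless of how often the names are repeated.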

Predicted data