Biological databases are stores of biological information.[1] The journal Nucleic Acids Research regularly publishes special issues on biological databases and maintains a list of them. The 2018 issue lists about 180 such databases and updates to previously described databases.[2] The Omics Discovery Index can be used to browse and search several biological databases.
Meta-databases are databases of databases: they collect data about data in order to generate new data. They can merge information from different sources and make it available in a new and more convenient form, or with an emphasis on a particular disease or organism. A metadatabase is a database model for metadata management, global querying of independent databases, and distributed data processing. The word "metadatabase" is an addition to the dictionary; originally, "metadata" was only a common term referring simply to data about data, such as tags, keywords, and markup headers.
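The merging step described above can be illustrated with a minimal Python sketch. It assumes each source exposes records keyed by a shared identifier; the function name `merge_sources`, the source names, and all record values are hypothetical placeholders, not data from any real database.

```python
from collections import defaultdict


def merge_sources(sources):
    """Merge per-source records into one view keyed by a shared identifier.

    `sources` maps a source name to {record_id: {field: value}}.
    Every merged field keeps a list of (source, value) pairs, so the
    origin of each value is preserved and conflicts stay visible.
    """
    merged = defaultdict(lambda: defaultdict(list))
    for source_name, records in sources.items():
        for record_id, fields in records.items():
            for field, value in fields.items():
                merged[record_id][field].append((source_name, value))
    # Convert the nested defaultdicts into plain dicts for the caller.
    return {record_id: dict(fields) for record_id, fields in merged.items()}


# Hypothetical placeholder records, for illustration only.
sources = {
    "source_a": {"gene_1": {"name": "example gene", "organism": "organism X"}},
    "source_b": {
        "gene_1": {"disease": "condition Y"},
        "gene_2": {"name": "other gene"},
    },
}

combined = merge_sources(sources)
print(combined["gene_1"])
```

Keeping the source name next to each value is one possible design choice: it lets a disease-focused or organism-focused view be derived later without losing track of where each fact came from.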
Model organism databases provide in-depth biological data for intensively studied organisms.
For DNA barcoding databases with a taxonomic focus, see #Taxonomic databases.
The primary databases make up the International Nucleotide Sequence Database (INSD). They include:
DDBJ (Japan), GenBank (USA) and European Nucleotide Archive (Europe) are repositories for nucleotide sequence data from all organisms. All three accept nucleotide sequence submissions, and then exchange new and updated data on a daily basis to achieve optimal synchronisation between them. These three databases are primary databases, as they house original sequence data. They collaborate with Sequence Read Archive (SRA), which archives raw reads from high-throughput sequencing instruments.
Secondary databases are:
Other databases
Main article: Microarray databases
These databases collect genome sequences, annotate and analyze them, and provide public access. Some add curation of experimental literature to improve computed annotations. These databases may hold the genomes of many species or the genome of a single model organism.
For rRNA databases with a taxonomic focus, see #Taxonomic databases.
Several publicly available data repositories and resources have been developed to support and manage protein-related information, biological knowledge discovery and data-driven hypothesis generation.[13] The databases in the table below are selected from the databases listed in the Nucleic Acids Research (NAR) database issues and database collection, and from the databases cross-referenced in UniProtKB. Most of these databases are cross-referenced with UniProt / UniProtKB so that identifiers can be mapped to each other.[13]
Database Short Name | Database Name |
---|---|
CCDS | The Consensus CDS protein set database |
DDBJ | DNA Data Bank of Japan |
ENA | European Nucleotide Archive |
GenBank | GenBank nucleotide sequence database |
RefSeq | NCBI Reference Sequence Database |
UniGene | Database of computationally identified transcripts from the same locus |
UniProtKB | Universal Protein Resource (UniProt) |
Database Short Name | Database Name |
---|---|
DisProt | Database of Protein Disorder |
MobiDB | Database of intrinsically disordered and mobile proteins |
ModBase | Database of Comparative Protein Structure Models |
PDBsum | Pictorial database of 3D structures in the Protein Data Bank |
ProteinModelPortal | Protein Model Portal of the PSI-Nature Structural Biology Knowledgebase |
SMR | Database of annotated 3D protein structure models |
For more protein structure databases, see also Protein structure database.
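Because most of the protein databases above are cross-referenced with UniProtKB, an identifier from one database can be translated to another by going through the shared accession. The Python sketch below illustrates that idea under stated assumptions: the cross-reference table is a small in-memory dictionary, the function names are hypothetical, and every accession and identifier is an invented placeholder rather than a real entry.

```python
def build_reverse_index(xrefs):
    """Index (database, identifier) pairs to the hub accessions citing them."""
    index = {}
    for accession, links in xrefs.items():
        for database, identifiers in links.items():
            for identifier in identifiers:
                index.setdefault((database, identifier), set()).add(accession)
    return index


def map_identifier(xrefs, index, from_db, identifier, to_db):
    """Translate an identifier between two databases via the hub accession."""
    results = set()
    for accession in index.get((from_db, identifier), ()):
        results.update(xrefs[accession].get(to_db, ()))
    return sorted(results)


# Hypothetical cross-reference table: hub accession -> database -> identifiers.
# All values are placeholders, not real database entries.
xrefs = {
    "HUB_0001": {"RefSeq": ["REFSEQ_A"], "PDBsum": ["STRUCT_1", "STRUCT_2"]},
    "HUB_0002": {"RefSeq": ["REFSEQ_B"], "DisProt": ["DISORDER_9"]},
}

index = build_reverse_index(xrefs)
print(map_identifier(xrefs, index, "RefSeq", "REFSEQ_A", "PDBsum"))
# prints ['STRUCT_1', 'STRUCT_2']
```

A mapping can legitimately return several identifiers, or none, since one protein entry may link to many structures and some entries have no counterpart in the target database.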
Main article: List of biodiversity databases
Numerous databases collect information about species and other taxonomic categories. The Catalogue of Life is a special case as it is a meta-database of about 150 specialized "global species databases" (GSDs) that have collected the names and other information on (almost) all described and thus "known" species.
Images play a critical role in biomedicine, with subjects ranging from anthropological specimens to zoology. However, there are relatively few databases dedicated to image collection, although some projects such as iNaturalist collect photos as a main part of their data. A special case of "images" is three-dimensional images, such as protein structures or 3D reconstructions of anatomical structures. Image databases include, among others:[18]