The Protein Information Resource (PIR), located at Georgetown University Medical Center, is an integrated public bioinformatics resource that supports genomic and proteomic research and related scientific studies. It contains protein sequence databases.^[1]^[2]^[3]^[4]^[5]^[6]^[7]

History

PIR was established in 1984 by the National Biomedical Research Foundation as a resource to assist researchers and customers in the identification and interpretation of protein sequence information. Prior to that, the foundation compiled the first comprehensive collection of macromolecular sequences in the Atlas of Protein Sequence and Structure, published from 1964 to 1974 under the editorship of Margaret Dayhoff. Dayhoff and her research group pioneered the development of computer methods for comparing protein sequences, for detecting distantly related sequences and duplications within sequences, and for inferring evolutionary histories from alignments of protein sequences.^[8]

Winona Barker and Robert Ledley assumed leadership of the project after Dayhoff's death in 1983. In 1999, Cathy H. Wu joined the National Biomedical Research Foundation, and later Georgetown University Medical Center, to head the bioinformatics efforts of PIR. She has served first as Principal Investigator and, since 2001, as Director.^{[citation needed]}

For four decades, PIR has provided many protein databases and analysis tools that are freely accessible to the scientific community. These include the Protein Sequence Database, the first international database (see PIR-International), which grew out of the Atlas of Protein Sequence and Structure.^{[citation needed]}

In 2002, PIR, along with its international partners, the European Bioinformatics Institute and the Swiss Institute of Bioinformatics, was awarded a grant from the NIH to create UniProt, a single worldwide database of protein sequence and function, by unifying the PIR Protein Sequence Database (PIR-PSD), Swiss-Prot, and TrEMBL databases. As of 2010, PIR offers a wide variety of resources mainly oriented to assist the propagation and standardization of protein annotation: PIRSF,^[9] iProClass, and iProLINK.

The Protein Ontology is another widely used resource released by the Protein Information Resource.^[10]^[11]

References

1. http://pir.georgetown.edu/ Archived 2014-03-12 at the Wayback Machine. Official website of PIR at Georgetown University.
2. Wu, Cathy; Nebert, Daniel W. (2004). "Update on genome completion and annotations: Protein Information Resource". Human Genomics. 1 (3): 229–33. doi:10.1186/1479-7364-1-3-229. PMC 3525084. PMID 15588483.
3. Wu, C. H. (2003). "The Protein Information Resource". Nucleic Acids Research. 31 (1): 345–347. doi:10.1093/nar/gkg040. PMC 165487. PMID 12520019.
4. Wu, CH; Huang, H; Arminski, L; Castro-Alvear, J; Chen, Y; Hu, ZZ; Ledley, RS; Lewis, KC; Mewes, HW; Orcutt, BC; Suzek, BE; Tsugita, A; Vinayaka, CR; Yeh, LS; Zhang, J; Barker, WC (2002-01-01). "The Protein Information Resource: an integrated public resource of functional annotation of proteins". Nucleic Acids Research. 30 (1): 35–37. doi:10.1093/nar/30.1.35. ISSN 1362-4962. PMC 99125. PMID 11752247.
5. Barker, W. C.; Garavelli, J. S.; Hou, Z.; Huang, H.; Ledley, R. S.; McGarvey, P. B.; Mewes, H. W.; Orcutt, B. C.; Pfeiffer, F.; Tsugita, A.; Vinayaka, C. R.; Xiao, C.; Yeh, L. S.; Wu, C. (2001). "Protein Information Resource: A community resource for expert annotation of protein data". Nucleic Acids Research. 29 (1): 29–32. doi:10.1093/nar/29.1.29. PMC 29802. PMID 11125041.
6. Barker, W. C. (2000). "The Protein Information Resource (PIR)". Nucleic Acids Research. 28 (1): 41–44. doi:10.1093/nar/28.1.41. PMC 102418. PMID 10592177.
7. George, D. G.; Dodson, R. J.; Garavelli, J. S.; Haft, D. H.; Hunt, L. T.; Marzec, C. R.; Orcutt, B. C.; Sidman, K. E.; Srinivasarao, G. Y.; Yeh, L.-S. L.; Arminski, L. M.; Ledley, R. S.; Tsugita, A.; Barker, W. C. (1997). "The Protein Information Resource (PIR) and the PIR-International Protein Sequence Database". Nucleic Acids Research. 25 (1): 24–27. doi:10.1093/nar/25.1.24. PMC 146415. PMID 9016497.
8. Izet, M (2016). "The Most Influential Scientists in the Development of Medical informatics (13): Margaret Belle Dayhoff". Acta Inform Med. 24 (4).
9. Wu, C. H.; Nikolskaya, A.; Huang, H.; Yeh, L. S.; Natale, D. A.; Vinayaka, C. R.; Hu, Z. Z.; Mazumder, R.; Kumar, S.; Kourtesis, P.; Ledley, R. S.; Suzek, B. E.; Arminski, L.; Chen, Y.; Zhang, J.; Cardenas, J. L.; Chung, S.; Castro-Alvear, J.; Dinkov, G.; Barker, W. C. (2004). "PIRSF: Family classification system at the Protein Information Resource". Nucleic Acids Research. 32 (90001): D112–D114. doi:10.1093/nar/gkh097. PMC 308831. PMID 14681371.
10. "GeorgeTown.edu - Protein Ontology". Archived from the original on 2011-03-10. Retrieved 2017-12-04.
11. Chicco, Davide; Masseroli, Marco (2019). "Biological and Medical Ontologies: Protein Ontology (PRO)". Encyclopedia of Bioinformatics and Computational Biology. pp. 832–837. doi:10.1016/B978-0-12-809633-8.20396-8. ISBN 9780128114322. S2CID 66974875.
