UniProt release 2010_07

Published June 15, 2010


UniProt and the International Nucleotide Sequence Database Collaboration

UniProt has a long-standing and very beneficial collaboration with the three members of the International Nucleotide Sequence Database Collaboration (INSDC): EMBL-Bank, GenBank and the DNA Data Bank of Japan (DDBJ). It began at the most basic level with an exchange of nucleotide and protein sequences. It then evolved through co-development of the nucleotide entry feature table definition, which ensures efficient automatic integration of appropriate protein information into UniProt, followed by reciprocal cross-references. Most recently it has progressed to a joint endorsement of protein naming guidelines. This was one outcome of the third NCBI Genome Annotation Workshop, held in Washington, USA in April 2010, where researchers from life science organizations worldwide collaborated to establish minimal standards for prokaryotic and viral annotation. Extremely productive discussions concerning annotation and underlying problems led to a number of resolutions that were adopted by the international microbial sequencing community. The highlight was the development and acceptance by the community of prokaryotic protein naming guidelines (see file proknameprot.txt), based on an initial proposal from the INSDC and UniProt. Following this agreement, the INSDC and UniProt also created a more generalised protein naming guideline (see file gennameprot.txt) that extends this guidance to taxa outside cellular prokaryotes. The decision by the INSDC to provide these guidelines for adoption by all submitters to their databases will greatly enhance the annotation of complete genomes and proteomes and ensure that the user community can exploit these data to their full potential. This is a particularly timely and exciting development given the current data avalanche. Future plans for the INSDC and UniProt involve collaboration with the NCBI's Genome project and the Reference Sequence (RefSeq) collection groups to provide synchronized, well-annotated genomes and proteomes.

The new files gennameprot.txt and proknameprot.txt are available in the Nomenclature and guidelines section of UniProt Documents, which can be accessed from the Documentation/Help pages.

UniProtKB News

New feature key INTRAMEM in the flat file

In addition to the feature keys TOPO_DOM (which describes the topology of regions for transmembrane proteins that span membrane compartments) and TRANSMEM (which describes the extent of the region spanning a membrane), we have introduced a new feature key INTRAMEM in the flat file to describe the extent of a region located in a membrane without crossing it.
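
As a rough illustration of how the three membrane-related feature keys might be picked out of a flat file, here is a minimal Python sketch. It assumes FT lines laid out as whitespace-separated "FT", key, start, end and description; the function name and the sample lines are hypothetical and are not taken from a real entry.

```python
# Feature keys named in this release note.
MEMBRANE_KEYS = {"TOPO_DOM", "TRANSMEM", "INTRAMEM"}


def membrane_features(flat_file_lines):
    """Return (key, start, end, description) tuples for membrane-related FT lines.

    Assumption: each feature starts on a line of the form
    'FT   KEY   FROM   TO   DESCRIPTION'. Positions are kept as strings
    because real entries may use non-numeric markers.
    """
    features = []
    for line in flat_file_lines:
        parts = line.split(None, 4)
        # Skip non-FT lines, continuation lines and other feature keys.
        if len(parts) < 4 or parts[0] != "FT" or parts[1] not in MEMBRANE_KEYS:
            continue
        description = parts[4].strip() if len(parts) > 4 else ""
        features.append((parts[1], parts[2], parts[3], description))
    return features


# Invented sample lines, for illustration only.
sample = [
    "FT   TOPO_DOM      1     29       Cytoplasmic.",
    "FT   TRANSMEM     30     50       Helical.",
    "FT   INTRAMEM     60     80       Helical.",
    "FT   CHAIN         1    120       Hypothetical protein.",
]

for feature in membrane_features(sample):
    print(feature)
```

Only the key is used to separate the three cases here; telling an INTRAMEM region (in the membrane without crossing it) from a TRANSMEM region (spanning it) relies entirely on the annotation in the entry.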

Cross-references to EnsemblBacteria, EnsemblFungi, EnsemblMetazoa, EnsemblPlants and EnsemblProtists

Cross-references have been added to EnsemblBacteria, EnsemblFungi, EnsemblMetazoa, EnsemblPlants and EnsemblProtists. These databases are part of Ensembl Genomes, which was created to complement the existing Ensembl site, which focuses on vertebrate genomes.

The format of the explicit links in the flat file is:

Resource abbreviation     EnsemblBacteria or EnsemblFungi or EnsemblMetazoa or EnsemblPlants or EnsemblProtists
Resource identifier       Transcript ID
Optional information 1    Protein ID
Optional information 2    Gene ID
Examples                  Q53653:
DR   EnsemblBacteria; EBSTAT00000032812; EBSTAP00000031682; EBSTAG00000032810.
DR   EnsemblFungi; YDR365W-B; YDR365W-B; YDR365W-B.
DR   EnsemblMetazoa; FBtr0071602; FBpp0071528; FBgn0020306.
DR   EnsemblMetazoa; FBtr0071603; FBpp0071529; FBgn0020306.
DR   EnsemblMetazoa; FBtr0071604; FBpp0071530; FBgn0020306.
DR   EnsemblPlants; AT1G66340.1-TAIR; AT1G66340.1-P; AT1G66340-TAIR-G.
DR   EnsemblProtists; DDB0305146; DDB0305146; DDB_G0286833.
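
The DR line format described above can be read with a few lines of Python. The following is a minimal sketch rather than an official parser: the function name and the dictionary keys are hypothetical, and it assumes every such line carries exactly four semicolon-separated fields and ends with a full stop, as in the examples.

```python
# Resource abbreviations introduced in this release.
ENSEMBL_GENOMES = {
    "EnsemblBacteria",
    "EnsemblFungi",
    "EnsemblMetazoa",
    "EnsemblPlants",
    "EnsemblProtists",
}


def parse_ensembl_genomes_dr(line):
    """Parse one DR line into a dict, or return None if it does not match.

    Assumed layout (from the examples in this note):
    'DR   <resource>; <transcript ID>; <protein ID>; <gene ID>.'
    """
    if not line.startswith("DR "):
        return None
    # Drop the 'DR' line code and the trailing full stop, then split the fields.
    body = line[2:].strip().rstrip(".")
    fields = [field.strip() for field in body.split(";")]
    if len(fields) != 4 or fields[0] not in ENSEMBL_GENOMES:
        return None
    resource, transcript_id, protein_id, gene_id = fields
    return {
        "resource": resource,
        "transcript_id": transcript_id,
        "protein_id": protein_id,
        "gene_id": gene_id,
    }


record = parse_ensembl_genomes_dr(
    "DR   EnsemblMetazoa; FBtr0071602; FBpp0071528; FBgn0020306."
)
print(record)
```

Because one gene can have several transcripts, as in the three EnsemblMetazoa lines sharing FBgn0020306, grouping parsed records by gene_id is one way to collect all transcripts of a gene.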


Changes concerning keywords

New keywords: