UniProt release 2018_11

Published December 5, 2018

Headline

Enhanced enzyme annotation in UniProtKB using Rhea

This release marks a major advance in the way UniProt describes enzyme function, with the introduction of Rhea as a vocabulary to annotate and represent enzyme-catalysed reactions in UniProtKB.

Rhea is a comprehensive expert-curated knowledgebase of biochemical reactions that uses the ChEBI (Chemical Entities of Biological Interest) ontology to describe reaction participants, their chemical structures, and chemical transformations. Rhea provides stable unique identifiers for reactions and standard computationally tractable descriptors for chemical transformations.

The enhanced enzyme annotations created using Rhea will form the basis of new search and identifier mapping services in UniProtKB that combine knowledge of small molecules and proteins. They will help UniProt users to more easily integrate and analyse metabolomic data, annotate metabolic networks and models, or mine reaction data to study enzyme evolution and predict new pathways for drug production or bioremediation.

Recent publications provide additional information on Rhea reactions and examples of services that integrate Rhea with biological knowledge from UniProtKB; we hope these will inspire you to dig deeper into the wealth of enzyme data in UniProtKB.

For further technical details about this change see below.

UniProtKB news

Standardization of ‘Catalytic activity’ annotations

A ‘Catalytic activity’ annotation describes a catalytic activity of an enzyme, i.e. a chemical reaction that the enzyme catalyzes. Until now, UniProt has followed the recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB) for the description of enzymatic activities, except for reactions that are described in the scientific literature but are not (yet) covered by the NC-IUBMB. The focus of the NC-IUBMB is the nomenclature and classification of enzymes by the reactions they catalyze. For this purpose, the NC-IUBMB typically describes an exemplary reaction for each class of enzymes, with the understanding that individual members of the class may use alternative reactants. The NC-IUBMB also uses its own names for the reactants.

To allow UniProt to curate reactions at the level of specific enzymes instead of enzyme classes, and to use standardized names for reactants, we now use chemical reaction descriptions from the Rhea database whenever possible. Rhea uses the ChEBI (Chemical Entities of Biological Interest) ontology to describe reaction participants that are small molecules as well as the reactive groups of large molecules (such as amino acid residues within proteins). These large molecules are identified by a RHEA-COMP identifier. For catalytic activities that can only be described in the form of free text, we continue to follow the NC-IUBMB descriptions. We have also started to curate the physiological direction of a reaction, i.e. the direction of the net flow of reactants in vivo, where evidence for it is available.

Because Enzyme Commission (EC) numbers focus on nomenclature, cross-references to them have historically been added to the Protein names subsection of UniProtKB entries. To link the EC numbers to the reactions on which they are based, we now also add them to ‘Catalytic activity’ annotations.

‘Catalytic activity’ annotations are found in UniProtKB entries, as well as in UniRule and SAAS annotation rules.

Below is a description of how this change affects the different file formats in which UniProt entries are distributed.

Text format

Note: Regex symbols indicate whether a pattern (as delimited by parentheses) is optional (?) or may occur 1 or more times (+).

Reaction description from Rhea:

 CC   -!- CATALYTIC ACTIVITY:
 CC       Reaction=<RheaText>; Xref=<RheaXref>(, <ReactantXref>)+;
 CC        ( EC=<EcNumber>;)?( Evidence={<Evidences>};)?
(CC       PhysiologicalDirection=left-to-right; Xref=<RheaXref>; Evidence={<Evidences>};)?
(CC       PhysiologicalDirection=right-to-left; Xref=<RheaXref>; Evidence={<Evidences>};)?

Where:

  • <RheaText>: Textual representation of an undirected Rhea reaction.
  • <RheaXref>: Cross-reference to a Rhea reaction (Rhea:n).
  • <ReactantXref>: Cross-reference to a reactant from ChEBI (CHEBI:n) or Rhea (RHEA-COMP:n).
  • <EcNumber>: EC number of the corresponding enzyme class, when available.
  • <Evidences>: List of evidences, when available.

Example: O36015

Previous format (based on NC-IUBMB):

CC   -!- CATALYTIC ACTIVITY: S-adenosyl-L-methionine +
CC       cytidine(32)/guanosine(34) in tRNA = S-adenosyl-L-homocysteine +
CC       2'-O-methylcytidine(32)/2'-O-methylguanosine(34) in tRNA.
CC       {ECO:0000255|HAMAP-Rule:MF_03162}.

New format (based on Rhea):

CC   -!- CATALYTIC ACTIVITY:
CC       Reaction=cytidine(32)/guanosine(34) in tRNA + 2 S-adenosyl-L-
CC         methionine = 2'-O-methylcytidine(32)/2'-O-methylguanosine(34) in
CC         tRNA + 2 H(+) + 2 S-adenosyl-L-homocysteine;
CC         Xref=Rhea:RHEA:42396, Rhea:RHEA-COMP:10246, Rhea:RHEA-
CC         COMP:10247, ChEBI:CHEBI:15378, ChEBI:CHEBI:57856,
CC         ChEBI:CHEBI:59789, ChEBI:CHEBI:74269, ChEBI:CHEBI:74445,
CC         ChEBI:CHEBI:74495, ChEBI:CHEBI:82748; EC=2.1.1.205;
CC         Evidence={ECO:0000255|HAMAP-Rule:MF_03162};

Example: A0A0S3QTD0

Previous format (based on NC-IUBMB):

CC   -!- CATALYTIC ACTIVITY: Acetyl-CoA + H(2)O + oxaloacetate = citrate +
CC       CoA. {ECO:0000269|PubMed:29420286}.

New format (based on Rhea):

CC   -!- CATALYTIC ACTIVITY:
CC       Reaction=acetyl-CoA + H2O + oxaloacetate = citrate + CoA + H(+);
CC         Xref=Rhea:RHEA:16845, ChEBI:CHEBI:15377, ChEBI:CHEBI:15378,
CC         ChEBI:CHEBI:16452, ChEBI:CHEBI:16947, ChEBI:CHEBI:57287,
CC         ChEBI:CHEBI:57288; EC=2.3.3.16;
CC         Evidence={ECO:0000269|PubMed:29420286};
CC       PhysiologicalDirection=left-to-right; Xref=Rhea:RHEA:16846;
CC         Evidence={ECO:0000269|PubMed:29420286};
CC       PhysiologicalDirection=right-to-left; Xref=Rhea:RHEA:16847;
CC         Evidence={ECO:0000269|PubMed:29420286};
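
The field layout described above can be read back with a small parser. The following is a minimal sketch, not an official UniProt tool: the function names (`unwrap`, `parse_catalytic_activity`) and the result dictionary are hypothetical, and the rule that a line ending in a hyphen continues without a space is an assumption inferred from the wrapped examples above. It handles only the new layout, where the `-!- CATALYTIC ACTIVITY:` header line carries no reaction text.

```python
import re

# Field keys used by the new 'Catalytic activity' text format.
KEYS = "Reaction|PhysiologicalDirection|Xref|EC|Evidence"
# A field is 'Key=value;' and ends where the next 'Key=' (or the end) follows.
FIELD = re.compile(rf"({KEYS})=(.*?);(?=\s+(?:{KEYS})=|\s*$)")

EXAMPLE = """\
CC   -!- CATALYTIC ACTIVITY:
CC       Reaction=acetyl-CoA + H2O + oxaloacetate = citrate + CoA + H(+);
CC         Xref=Rhea:RHEA:16845, ChEBI:CHEBI:15377, ChEBI:CHEBI:15378,
CC         ChEBI:CHEBI:16452, ChEBI:CHEBI:16947, ChEBI:CHEBI:57287,
CC         ChEBI:CHEBI:57288; EC=2.3.3.16;
CC         Evidence={ECO:0000269|PubMed:29420286};
CC       PhysiologicalDirection=left-to-right; Xref=Rhea:RHEA:16846;
CC         Evidence={ECO:0000269|PubMed:29420286};
CC       PhysiologicalDirection=right-to-left; Xref=Rhea:RHEA:16847;
CC         Evidence={ECO:0000269|PubMed:29420286};
"""


def unwrap(block):
    """Join the wrapped CC lines of one annotation into a single string."""
    text = ""
    for line in block.splitlines():
        body = line.strip()[2:].strip()  # drop the 'CC' line code and padding
        if not body or body.startswith("-!-"):
            continue  # skip blank lines and the '-!- CATALYTIC ACTIVITY:' header
        # Assumption: a line ending in '-' was broken inside a token, so the
        # continuation is appended without a space.
        text += body if (not text or text.endswith("-")) else " " + body
    return text


def parse_catalytic_activity(block):
    """Split one new-format annotation into reaction and direction fields."""
    result = {"reaction": None, "xrefs": [], "ec": None,
              "evidence": None, "directions": []}
    current = result  # Xref/EC/Evidence attach to the most recent section
    for match in FIELD.finditer(unwrap(block)):
        key, value = match.group(1), match.group(2).strip()
        if key == "Reaction":
            result["reaction"] = value
        elif key == "PhysiologicalDirection":
            current = {"direction": value, "xrefs": [], "evidence": None}
            result["directions"].append(current)
        elif key == "Xref":
            current["xrefs"] = [x.strip() for x in value.split(",")]
        elif key == "EC":
            current["ec"] = value
        elif key == "Evidence":
            current["evidence"] = value.strip("{}")
    return result


parsed = parse_catalytic_activity(EXAMPLE)
print(parsed["reaction"])
print(parsed["ec"], [d["direction"] for d in parsed["directions"]])
```

Fields are located by their known keys rather than by splitting on semicolons, because free-text reaction descriptions may themselves contain punctuation.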

Reaction description from NC-IUBMB:

CC   -!- CATALYTIC ACTIVITY:
CC       Reaction=<IUBMBText>; EC=<EcNumber>;( Evidence={<Evidences>};)?

Where:

  • <IUBMBText>: An NC-IUBMB reaction description.
  • <EcNumber>: EC number of the corresponding enzyme class.
  • <Evidences>: List of evidences, when available.

Example: P17050

Previous format (based on NC-IUBMB):

CC   -!- CATALYTIC ACTIVITY: Cleavage of non-reducing alpha-(1->3)-N-
CC       acetylgalactosamine residues from human blood group A and AB mucin
CC       glycoproteins, Forssman hapten and blood group A lacto series
CC       glycolipids. {ECO:0000269|PubMed:19683538}.

New format (based on NC-IUBMB):

CC   -!- CATALYTIC ACTIVITY:
CC       Reaction=Cleavage of non-reducing alpha-(1->3)-N-
CC         acetylgalactosamine residues from human blood group A and AB
CC         mucin glycoproteins, Forssman hapten and blood group A lacto
CC         series glycolipids.; EC=3.2.1.49;
CC         Evidence={ECO:0000269|PubMed:19683538};
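
For free-text activities, the relation between the two layouts is mostly mechanical: the description becomes the `Reaction` field, the EC number is added, and the trailing evidence moves into an `Evidence` field. The sketch below illustrates that mapping on an already unwrapped annotation. The function name `to_new_layout` is hypothetical, the EC number has to be supplied by the caller because the previous layout did not carry it, and the output is a single logical line without the 'CC' prefixes or line wrapping.

```python
import re

# Previous layout: '<text> {<evidence>}.' with the evidence part optional.
OLD_LAYOUT = re.compile(r"^(?P<text>.*?)(?:\s*\{(?P<evidence>[^{}]*)\}\.)?$", re.S)


def to_new_layout(old_text, ec_number):
    """Render an old free-text annotation as one unwrapped new-style field list."""
    # Collapse any whitespace runs so the pattern sees a single line.
    match = OLD_LAYOUT.match(" ".join(old_text.split()))
    fields = [f"Reaction={match.group('text')}", f"EC={ec_number}"]
    if match.group("evidence"):
        fields.append(f"Evidence={{{match.group('evidence')}}}")
    return "; ".join(fields) + ";"


old = ("Cleavage of non-reducing alpha-(1->3)-N-acetylgalactosamine residues "
       "from human blood group A and AB mucin glycoproteins, Forssman hapten "
       "and blood group A lacto series glycolipids. "
       "{ECO:0000269|PubMed:19683538}.")
print(to_new_layout(old, "3.2.1.49"))
```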

XML format

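We have extended the UniProt XSD with new elements and types as shown below (the additions are the reaction and physiologicalReaction elements and the reactionType and physiologicalReactionType types):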

    <xs:complexType name="commentType">
        ...
        <xs:sequence>
            <xs:element name="molecule" type="moleculeType" minOccurs="0"/>
            <xs:choice minOccurs="0">
                ...
                <xs:sequence>
                    <xs:annotation>
                        <xs:documentation>Used in 'catalytic activity' annotations.</xs:documentation>
                    </xs:annotation>
                    <xs:element name="reaction" type="reactionType"/>
                    <xs:element name="physiologicalReaction" type="physiologicalReactionType" minOccurs="0" maxOccurs="2"/>
                </xs:sequence>
                ...
            </xs:choice>
            ...
        </xs:sequence>
        ...
    </xs:complexType>
    ...
    <xs:complexType name="reactionType">
        <xs:annotation>
            <xs:documentation>Describes a chemical reaction.</xs:documentation>
        </xs:annotation>
        <xs:sequence>
            <xs:element name="text" type="xs:string"/>
            <xs:element name="dbReference" type="dbReferenceType" minOccurs="1" maxOccurs="unbounded"/>
        </xs:sequence>
        <xs:attribute name="evidence" type="intListType" use="optional"/>
    </xs:complexType>

    <xs:complexType name="physiologicalReactionType">
        <xs:annotation>
            <xs:documentation>Describes a physiological reaction.</xs:documentation>
        </xs:annotation>
        <xs:sequence>
            <xs:element name="dbReference" type="dbReferenceType"/>
        </xs:sequence>
        <xs:attribute name="direction" use="required">
            <xs:simpleType>
                <xs:restriction base="xs:string">
                    <xs:enumeration value="left-to-right"/>
                    <xs:enumeration value="right-to-left"/>
                </xs:restriction>
            </xs:simpleType>
        </xs:attribute>
        <xs:attribute name="evidence" type="intListType" use="optional"/>
    </xs:complexType>

Reaction description from Rhea:

Example: O36015

Previous format (based on NC-IUBMB):

<comment type="catalytic activity">
  <text evidence="1">S-adenosyl-L-methionine + cytidine(32)/guanosine(34) in tRNA = S-adenosyl-L-homocysteine + 2'-O-methylcytidine(32)/2'-O-methylguanosine(34) in tRNA.</text>
</comment>

New format (based on Rhea):

<comment type="catalytic activity">
  <reaction evidence="1">
    <text>cytidine(32)/guanosine(34) in tRNA + 2 S-adenosyl-L-methionine = 2'-O-methylcytidine(32)/2'-O-methylguanosine(34) in tRNA + 2 H(+) + 2 S-adenosyl-L-homocysteine</text>
    <dbReference type="Rhea" id="RHEA:42396"/>
    <dbReference type="Rhea" id="RHEA-COMP:10246"/>
    <dbReference type="Rhea" id="RHEA-COMP:10247"/>
    <dbReference type="ChEBI" id="CHEBI:15378"/>
    <dbReference type="ChEBI" id="CHEBI:57856"/>
    <dbReference type="ChEBI" id="CHEBI:59789"/>
    <dbReference type="ChEBI" id="CHEBI:74269"/>
    <dbReference type="ChEBI" id="CHEBI:74445"/>
    <dbReference type="ChEBI" id="CHEBI:74495"/>
    <dbReference type="ChEBI" id="CHEBI:82748"/>
    <dbReference type="EC" id="2.1.1.205"/>
  </reaction>
</comment>

Example: A0A0S3QTD0

Previous format (based on NC-IUBMB):

<comment type="catalytic activity">
  <text evidence="2">Acetyl-CoA + H(2)O + oxaloacetate = citrate + CoA.</text>
</comment>

New format (based on Rhea):

<comment type="catalytic activity">
  <reaction evidence="2">
    <text>acetyl-CoA + H2O + oxaloacetate = citrate + CoA + H(+)</text>
    <dbReference type="Rhea" id="RHEA:16845"/>
    <dbReference type="ChEBI" id="CHEBI:15377"/>
    <dbReference type="ChEBI" id="CHEBI:15378"/>
    <dbReference type="ChEBI" id="CHEBI:16452"/>
    <dbReference type="ChEBI" id="CHEBI:16947"/>
    <dbReference type="ChEBI" id="CHEBI:57287"/>
    <dbReference type="ChEBI" id="CHEBI:57288"/>
    <dbReference type="EC" id="2.3.3.16"/>
  </reaction>
  <physiologicalReaction direction="left-to-right" evidence="2">
    <dbReference type="Rhea" id="RHEA:16846"/>
  </physiologicalReaction>
  <physiologicalReaction direction="right-to-left" evidence="2">
    <dbReference type="Rhea" id="RHEA:16847"/>
  </physiologicalReaction>
</comment>
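
A consumer of the new XML layout might read the reaction and its physiological directions as sketched below, using Python's standard `xml.etree.ElementTree`. This is an illustration under stated assumptions: the function name `read_catalytic_activity` and the shape of the returned dictionary are hypothetical, and the fragment is parsed exactly as printed above. Complete entry files may declare a default XML namespace, in which case the tag names would need to be namespace-qualified.

```python
import xml.etree.ElementTree as ET

FRAGMENT = """\
<comment type="catalytic activity">
  <reaction evidence="2">
    <text>acetyl-CoA + H2O + oxaloacetate = citrate + CoA + H(+)</text>
    <dbReference type="Rhea" id="RHEA:16845"/>
    <dbReference type="ChEBI" id="CHEBI:15377"/>
    <dbReference type="ChEBI" id="CHEBI:15378"/>
    <dbReference type="ChEBI" id="CHEBI:16452"/>
    <dbReference type="ChEBI" id="CHEBI:16947"/>
    <dbReference type="ChEBI" id="CHEBI:57287"/>
    <dbReference type="ChEBI" id="CHEBI:57288"/>
    <dbReference type="EC" id="2.3.3.16"/>
  </reaction>
  <physiologicalReaction direction="left-to-right" evidence="2">
    <dbReference type="Rhea" id="RHEA:16846"/>
  </physiologicalReaction>
  <physiologicalReaction direction="right-to-left" evidence="2">
    <dbReference type="Rhea" id="RHEA:16847"/>
  </physiologicalReaction>
</comment>
"""


def read_catalytic_activity(comment):
    """Collect reaction text, cross-references and directions from one comment."""
    reaction = comment.find("reaction")
    refs = {}
    # Group cross-reference ids by database type (Rhea, ChEBI, EC).
    for ref in reaction.findall("dbReference"):
        refs.setdefault(ref.get("type"), []).append(ref.get("id"))
    # The schema allows at most two physiologicalReaction elements.
    directions = {
        phys.get("direction"): phys.find("dbReference").get("id")
        for phys in comment.findall("physiologicalReaction")
    }
    return {
        "text": reaction.findtext("text"),
        "refs": refs,
        "directions": directions,
        # The evidence attribute is a whitespace-separated list of keys.
        "evidence": reaction.get("evidence", "").split(),
    }


info = read_catalytic_activity(ET.fromstring(FRAGMENT))
print(info["refs"]["Rhea"], info["refs"]["EC"])
print(info["directions"])
```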

Reaction description from NC-IUBMB:

Example: P17050

Previous format (based on NC-IUBMB):

<comment type="catalytic activity">
  <text evidence="6">Cleavage of non-reducing alpha-(1->3)-N-acetylgalactosamine residues from human blood group A and AB mucin glycoproteins, Forssman hapten and blood group A lacto series glycolipids.</text>
</comment>

New format (based on NC-IUBMB):

<comment type="catalytic activity">
  <reaction evidence="6">
    <text>Cleavage of non-reducing alpha-(1->3)-N-acetylgalactosamine residues from human blood group A and AB mucin glycoproteins, Forssman hapten and blood group A lacto series glycolipids.</text>
    <dbReference type="EC" id="3.2.1.49"/>
  </reaction>
</comment>

RDF format

Note: Evidence-related statements are omitted since their format does not change. In the previous format, evidence was attributed via reification of the rdfs:comment statement. In the new format, the up:catalyticActivity and up:catalyzedPhysiologicalReaction statements are reified.

Reaction description from Rhea:

Example: O36015

Previous format (based on NC-IUBMB):

uniprot:O36015
  up:annotation <O36015#SIP5A4ED6FF66BBF481> .

<O36015#SIP5A4ED6FF66BBF481>
  rdf:type up:Catalytic_Activity_Annotation ;
  rdfs:comment "S-adenosyl-L-methionine + cytidine(32)/guanosine(34) in tRNA = S-adenosyl-L-homocysteine + 2'-O-methylcytidine(32)/2'-O-methylguanosine(34) in tRNA." .

New format (based on Rhea):

uniprot:O36015
  up:annotation <O36015#SIP962CEE3C69B2533E> .

<O36015#SIP962CEE3C69B2533E>
  rdf:type up:Catalytic_Activity_Annotation ;
  up:catalyticActivity <O36015#SIP6D2D3E976AAD17F0> .

<O36015#SIP6D2D3E976AAD17F0>
  rdf:type up:Catalytic_Activity ;
  up:catalyzedReaction <http://rdf.rhea-db.org/42396> ;
  up:enzymeClass enzyme:2.1.1.205 .

Example: A0A0S3QTD0

Previous format (based on NC-IUBMB):

uniprot:A0A0S3QTD0
  up:annotation <A0A0S3QTD0#SIPF04A1EC4C8EBCB08> .

<A0A0S3QTD0#SIPF04A1EC4C8EBCB08>
  rdf:type up:Catalytic_Activity_Annotation ;
  rdfs:comment "Acetyl-CoA + H(2)O + oxaloacetate = citrate + CoA." .

New format (based on Rhea):

uniprot:A0A0S3QTD0
  up:annotation <A0A0S3QTD0#SIP8171B3125ADE4E9D> .

<A0A0S3QTD0#SIP8171B3125ADE4E9D>
  rdf:type up:Catalytic_Activity_Annotation ;
  up:catalyticActivity <A0A0S3QTD0#SIP1A91565011EC50F6> ;
  up:catalyzedPhysiologicalReaction <http://rdf.rhea-db.org/16846> ,
                                    <http://rdf.rhea-db.org/16847> .

<A0A0S3QTD0#SIP1A91565011EC50F6>
  rdf:type up:Catalytic_Activity ;
  up:catalyzedReaction <http://rdf.rhea-db.org/16845> ;
  up:enzymeClass enzyme:2.3.3.16 .
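
In the new RDF layout, reaching the Rhea reaction takes one more hop than before: entry, then annotation, then catalytic activity, then reaction. The sketch below walks that path over the statements of the example above, modelled as plain Python tuples instead of being loaded with an RDF library. The helper names (`objects`, `reactions_for`) are hypothetical, and the prefixed names are kept as opaque strings.

```python
ANNOTATION = "<A0A0S3QTD0#SIP8171B3125ADE4E9D>"
ACTIVITY = "<A0A0S3QTD0#SIP1A91565011EC50F6>"

# The statements of the new-format example as (subject, predicate, object).
TRIPLES = [
    ("uniprot:A0A0S3QTD0", "up:annotation", ANNOTATION),
    (ANNOTATION, "rdf:type", "up:Catalytic_Activity_Annotation"),
    (ANNOTATION, "up:catalyticActivity", ACTIVITY),
    (ANNOTATION, "up:catalyzedPhysiologicalReaction", "<http://rdf.rhea-db.org/16846>"),
    (ANNOTATION, "up:catalyzedPhysiologicalReaction", "<http://rdf.rhea-db.org/16847>"),
    (ACTIVITY, "rdf:type", "up:Catalytic_Activity"),
    (ACTIVITY, "up:catalyzedReaction", "<http://rdf.rhea-db.org/16845>"),
    (ACTIVITY, "up:enzymeClass", "enzyme:2.3.3.16"),
]


def objects(triples, subject, predicate):
    """Return every object of the statements matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def reactions_for(triples, protein):
    """Follow entry -> annotation -> catalytic activity -> reaction."""
    found = []
    for annotation in objects(triples, protein, "up:annotation"):
        types = objects(triples, annotation, "rdf:type")
        if "up:Catalytic_Activity_Annotation" not in types:
            continue  # ignore other annotation types
        for activity in objects(triples, annotation, "up:catalyticActivity"):
            found.append({
                "reactions": objects(triples, activity, "up:catalyzedReaction"),
                "enzyme_classes": objects(triples, activity, "up:enzymeClass"),
                # Physiological reactions hang off the annotation node itself.
                "physiological": objects(
                    triples, annotation, "up:catalyzedPhysiologicalReaction"),
            })
    return found


result = reactions_for(TRIPLES, "uniprot:A0A0S3QTD0")
print(result[0]["reactions"], result[0]["enzyme_classes"])
```

Against a real triple store the same traversal would normally be written as a query over these predicates. The tuple model is used here only to keep the example self-contained.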

Reaction description from NC-IUBMB:

Example: P17050

Previous format (based on NC-IUBMB):

uniprot:P17050
  up:annotation <P17050#SIP0FD272930B1683DE> .

<P17050#SIP0FD272930B1683DE>
  rdf:type up:Catalytic_Activity_Annotation ;
  rdfs:comment "Cleavage of non-reducing alpha-(1->3)-N-acetylgalactosamine residues from human blood group A and AB mucin glycoproteins, Forssman hapten and blood group A lacto series glycolipids." .
  

New format (based on NC-IUBMB):

uniprot:P17050
  up:annotation <P17050#SIP0FD272930B1683DE> .

<P17050#SIP0FD272930B1683DE>
  rdf:type up:Catalytic_Activity_Annotation ;
  up:catalyticActivity <P17050#SIP0FD272930B1683DF> .

<P17050#SIP0FD272930B1683DF>
  rdf:type up:Catalytic_Activity ;
  skos:closeMatch enzyme:3.2.1.49#SIP0FD272930B1683DG ;
  up:enzymeClass enzyme:3.2.1.49 .

Change of the RDF representation of enzyme related data

We have changed the RDF representation of ENZYME records in order to refer from UniProt ‘Catalytic activity’ annotations to individual enzymatic activities. The range of the activity predicate has been changed to the type Catalytic_Activity.

Example: 1.11.1.21

Previous format:

enzyme:1.11.1.21
  rdf:type up:Enzyme ;
  skos:prefLabel "Catalase peroxidase" ;
  up:activity "Donor + H(2)O(2) = oxidized donor + 2 H(2)O." ;
  up:activity "2 H(2)O(2) = O(2) + 2 H(2)O." ;
  ...

New format:

enzyme:1.11.1.21
  rdf:type up:Enzyme ;
  skos:prefLabel "Catalase peroxidase" ;
  up:activity <1.11.1.21#SIP017EC216DF0EDC2A> ;
  up:activity <1.11.1.21#SIP018ED427AB1BAS3X> ;
  ...

<1.11.1.21#SIP017EC216DF0EDC2A>
  rdf:type up:Catalytic_Activity ;
  rdfs:label "Donor + H(2)O(2) = oxidized donor + 2 H(2)O." .

<1.11.1.21#SIP018ED427AB1BAS3X>
  rdf:type up:Catalytic_Activity ;
  rdfs:label "2 H(2)O(2) = O(2) + 2 H(2)O." .

Changes to the controlled vocabulary of human diseases

New diseases:

Modified diseases:

Deleted diseases:

  • Deafness, autosomal recessive, 105

Changes to the controlled vocabulary for PTMs

New term for the feature key ‘Modified residue’ (‘MOD_RES’ in the flat file):

  • Murein peptidoglycan amidated serine

Changes in subcellular location controlled vocabulary

New subcellular location:

UniProt is an ELIXIR core data resource
Main funding by: National Institutes of Health
