Where can I find information about the enzymes that catalyze Rhea reactions?
In Rhea, information about the enzymes that catalyze reactions is available from two sources:
- UniProtKB, which describes protein sequences and the reactions they catalyze
- ENZYME, which describes the IUBMB enzyme classification (EC numbers)
Figure 1
Content:
- Introduction
- Origin of data
- Rhea content
- How to access enzyme-related data?
- From the Rhea web site
- In a reaction page
- Search for EC number
- Search for UniProt accession
- Mine Rhea content as a whole
- From a reaction result table search
- From the Download page
- Related documents
Introduction
We encourage you to watch this excellent video published by PDB-101, the Educational portal of the RCSB Protein Data Bank centre:
- How enzymes work?
It shows how enzymes catalyze reactions, using an animation of the reaction catalyzed by aconitase (aconitate hydratase, P20004 (ACON_BOVIN), EC 4.2.1.3, RHEA:10336).
We also encourage you to read our article:
- Enzyme annotation in UniProtKB using Rhea
published in Nucleic Acids Res. in 2019 (PubMed: 31688925)
Origin of data
1. UniProtKB enzyme sequences
Since December 2018, the UniProt project has used Rhea to describe the reactions catalyzed by enzymes in the UniProt Knowledgebase (UniProtKB).
- UniProt curators provide expert-curated links between Rhea reactions and UniProtKB/Swiss-Prot protein sequences.
- UniProt automated annotation projects (HAMAP, etc.) provide links between Rhea reactions and UniProtKB/TrEMBL protein sequences.
Click here to retrieve the whole set of UniProt enzymes annotated with Rhea reactions.
In the 2020_02 UniProt release, 218,251 UniProtKB/Swiss-Prot entries are annotated with Rhea reactions (expert curation). Additionally, more than 17.5 million unreviewed sequences (UniProtKB/TrEMBL) are predicted to be linked to Rhea reactions (computational annotation).
That said, we are well aware that we are only at the beginning of a long path. Curation of enzymes is a high priority for the UniProt consortium, and UniProtKB/Swiss-Prot curators are the main requestors of new Rhea reactions.
Note for UniProt users:
UniProt users can search UniProtKB using the names and identifiers of compounds (from ChEBI) and the identifiers of Rhea reactions; find out more at the UniProt documentation and news page.
2. ENZYME Enzyme classification
The NC-IUBMB is in charge of classifying enzymes according to the reaction(s) they catalyze.
The ENZYME database is a SIB project that describes this enzyme classification. There is a strong collaboration between the ENZYME and Rhea projects. ENZYME and Rhea curators translate the textual descriptions of the reactions in the enzyme classification into Rhea reactions, thereby providing expert-curated links between EC numbers and Rhea reactions (see more details here).
3. Curation workflow
The ENZYME database is routinely updated with IUBMB data, which adds, updates or deletes EC numbers.
Whenever possible, IUBMB reactions are translated into Rhea reactions. These reactions may then be updated to follow IUBMB changes.
EC numbers and Rhea reactions are used to annotate the functions of enzyme-related sequences in the UniProtKB.
In case of changes in ENZYME or Rhea, the UniProt entries are updated accordingly.
We do our best to get consistent and synchronized data between the three resources.
Figure 2
Rhea content
To give you an idea of the trends, the figure below shows the distribution of enzyme-catalyzed reactions as of April 2020 (Rhea release 112 and UniProt & ENZYME release 2020_02).
The pie-chart shows that 71% of Rhea reactions (8,892/12,616) are linked to either an EC number, a UniProt entry or both.
Among them,
- more than 5,000 reactions are linked to both EC numbers and UniProt entries
- 17% are linked to UniProt entries only (i.e. they are not associated with an EC number)
- 14.3% of the Rhea reactions are linked to EC numbers only (i.e. they are not (yet) used in UniProt)
Figure 3: enzyme-catalyzed annotations (as of April 2020 release)
In the next sections, you will learn how to access the enzyme-catalyzed reactions and how to obtain these numbers, either with the Rhea search tool or through programmatic access.
How to access enzyme-related data?
From the Rhea web site
In a reaction page
Enzyme-related data are displayed at the top of a reaction entry page as well as in the Links to other resources section, as illustrated in Figure 1.
Search for EC number
Use the prefix 'ec:' in the simple search. You can search at any of the 4 levels of the enzyme classification, and you can use the wildcard * to retrieve all the reactions linked to an EC number.
Examples:
Query | Result |
---|---|
ec:* | retrieve all the reactions linked to an EC number |
ec:2 or ec:2.-.-.- | retrieve all the reactions catalyzed by Transferases |
ec:2.1 or ec:2.1.-.- | retrieve all the reactions catalyzed by Transferases Transferring one-carbon groups |
ec:2.1.1 or ec:2.1.1.- | retrieve all the reactions catalyzed by Methyltransferases |
ec:2.1.1.160 | retrieve all the reactions catalyzed by Caffeine synthase |
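The two equivalent spellings in the table (for example ec:2.1 and ec:2.1.-.-) can be generated programmatically. The following is a minimal sketch, not part of Rhea itself: the helper names `pad_ec` and `ec_query` are hypothetical, and the sketch only builds the query strings shown above without contacting the Rhea web site.

```python
def pad_ec(ec: str) -> str:
    """Pad a partial EC number with '-' so that it always has four levels."""
    levels = [part for part in ec.strip().split(".") if part]
    if not 1 <= len(levels) <= 4:
        raise ValueError(f"expected 1 to 4 EC levels, got {ec!r}")
    return ".".join(levels + ["-"] * (4 - len(levels)))


def ec_query(ec: str) -> str:
    """Build a simple-search term such as 'ec:2.1.-.-'; '*' is passed through."""
    if ec.strip() == "*":
        return "ec:*"
    return "ec:" + pad_ec(ec)


if __name__ == "__main__":
    for term in ["*", "2", "2.1", "2.1.1", "2.1.1.160"]:
        print(ec_query(term))
```

The padded form makes the classification level explicit, which can be convenient when the same EC numbers are also compared with entries from ENZYME.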
Search for UniProt accession
Simply enter a UniProt accession in the simple search. You can also use the prefix 'uniprot:' to make sure that your query term is matched against UniProt accessions only.
Examples:
Query | Result |
---|---|
query=P08159 | retrieve the reaction(s) catalyzed by P08159 UniProt entry |
query=uniprot:Q9Y6P5 | retrieve the reaction(s) catalyzed by Q9Y6P5 UniProt entry |
query=uniprot:* | retrieve the whole set of Rhea reactions linked to UniProt. |
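The `query=` form used in the table suggests that the search term is passed as a URL parameter. As a hedged sketch, the snippet below builds such URLs with the Python standard library. The base URL `https://www.rhea-db.org/rhea` and the helper name `search_url` are assumptions made for illustration; check the Rhea web site for the actual endpoint before relying on them.

```python
from urllib.parse import urlencode

# Assumption: base URL of the Rhea search page; verify it on the Rhea web site.
RHEA_SEARCH_URL = "https://www.rhea-db.org/rhea"


def search_url(term: str, prefix: str = "") -> str:
    """Return a search URL for a term, optionally restricted with a prefix."""
    query = f"{prefix}:{term}" if prefix else term
    # urlencode percent-encodes characters such as ':' and '*'.
    return f"{RHEA_SEARCH_URL}?{urlencode({'query': query})}"


if __name__ == "__main__":
    print(search_url("P08159"))
    print(search_url("Q9Y6P5", prefix="uniprot"))
    print(search_url("*", prefix="uniprot"))
```

No request is sent here; the resulting URL can be opened in a browser or passed to any HTTP client.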
Mine Rhea content as a whole
To retrieve the data described in the pie-chart presented in Figure 3, you can perform the following queries:
Query | Result |
---|---|
query=ec:* | retrieve the whole set of Rhea reactions linked to an EC number |
query=uniprot:* | retrieve the whole set of Rhea reactions linked to UniProt |
query=uniprot:* and ec:* | retrieve the whole set of Rhea reactions linked to UniProt and EC numbers |
query=uniprot:* not ec | retrieve the set of Rhea reactions linked to UniProt but not to EC numbers |
query=ec:* not uniprot | retrieve the set of Rhea reactions linked to EC numbers but not to UniProt |
query=not ec:* not uniprot:* | retrieve the set of Rhea reactions linked to neither UniProt nor EC numbers |
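The queries above amount to set operations on two groups of reactions: those linked to an EC number and those linked to UniProt. The sketch below reproduces that logic locally with Python sets. The function name `partition_reactions` and the toy identifiers are invented for illustration; real input would be the Rhea identifiers returned by the `ec:*` and `uniprot:*` queries.

```python
def partition_reactions(all_ids, ec_linked, uniprot_linked):
    """Split reaction identifiers into the four pie-chart categories."""
    all_ids = set(all_ids)
    ec = set(ec_linked) & all_ids
    uniprot = set(uniprot_linked) & all_ids
    return {
        "uniprot and ec": uniprot & ec,
        "uniprot not ec": uniprot - ec,
        "ec not uniprot": ec - uniprot,
        "neither": all_ids - ec - uniprot,
    }


if __name__ == "__main__":
    # Toy identifiers, not real Rhea reactions.
    parts = partition_reactions(range(1, 7), {1, 2, 3}, {3, 4})
    total = sum(len(ids) for ids in parts.values())
    for name, ids in parts.items():
        print(f"{name}: {len(ids)} ({100 * len(ids) / total:.1f}%)")
```

Because the four categories are disjoint and cover every reaction, their sizes always add up to the total number of reactions, which is a quick consistency check on the numbers retrieved.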
From a reaction result table search
In a reaction result page (Figure 4), you can access enzyme information in several ways:
Filter results
- (1) Restrict your search to reactions linked to one of the seven main classes of enzyme classification.
- (2) Restrict your search to unclassified reactions, i.e. reactions not linked to an EC number.
Find enzymes
(3) This functionality allows you to search UniProtKB with a selected set of reactions.
Once you have selected the Rhea reactions of interest and clicked the Find enzymes button, you will be redirected to the UniProt web site with a query matching your selection of reactions.
Reaction table
The columns Enzyme class and EC number display enzyme classification data.
The column Enzymes displays the number of protein sequences annotated with the reaction in the UniProt Knowledgebase.
(4) reactions with EC number only
(5) reactions not classified in ENZYME/IUBMB but used in UniProtKB to annotate enzyme sequences
(6) reactions both linked to EC numbers & UniProtKB
Figure 4: How to access enzyme data from a result page?
From the Download page
Rhea provides several exports in tab-separated (TSV) format.
Enzyme classification:
UniProt accessions:
- rhea2uniprot_sprot.tsv (UniProtKB/Swiss-Prot)
- rhea2uniprot_trembl.tsv.gz (UniProtKB/TrEMBL)
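Once downloaded, these files can be read with any TSV parser. The sketch below groups Rhea identifiers by UniProt accession using the Python standard library. The column names (`RHEA_ID`, `DIRECTION`, `MASTER_ID`, `ID`) are an assumption about the file header, and the sample rows are illustrative rather than copied from a real export; inspect the first line of the downloaded file before relying on this layout.

```python
import csv
import io
from collections import defaultdict

# Illustrative content only; the header layout is an assumption.
SAMPLE_TSV = (
    "RHEA_ID\tDIRECTION\tMASTER_ID\tID\n"
    "10336\tUN\t10336\tP20004\n"
    "99991\tUN\t99991\tPLACEHOLDER1\n"
    "99995\tUN\t99995\tPLACEHOLDER1\n"
)


def load_mapping(handle) -> dict:
    """Map each UniProt accession to the set of Rhea identifiers linked to it."""
    mapping = defaultdict(set)
    for row in csv.DictReader(handle, delimiter="\t"):
        mapping[row["ID"]].add("RHEA:" + row["RHEA_ID"])
    return dict(mapping)


if __name__ == "__main__":
    mapping = load_mapping(io.StringIO(SAMPLE_TSV))
    for accession, reactions in sorted(mapping.items()):
        print(accession, sorted(reactions))
```

For a real file, pass `open("rhea2uniprot_sprot.tsv", newline="")` instead of the in-memory sample; the compressed TrEMBL export can be read the same way with `gzip.open("rhea2uniprot_trembl.tsv.gz", "rt", newline="")`.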