Introduction

Number of entries
New entries 2,641,656
Updated entries 57,249,993
Unchanged entries 27,399,683
Total 87,291,332
Entries with updated sequences 3,745
With a fragmented AA sequence 8,947,736
With known alternative products 0
Protein Existence (PE) Number of entries
1 Evidence at protein level 128,383
2 Evidence at transcript level 1,084,733
3 Inferred from homology 21,022,078
4 Predicted 65,056,138
5 Uncertain 0
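
The two tables above can be cross-checked against each other: the release-status counts and the protein-existence counts should each add up to the same total. The following is a minimal Python sketch of that check, with the figures copied from the tables; the variable names are illustrative and not part of any UniProt tooling.

```python
# Figures copied from the two tables above; names are illustrative only.
status_counts = {
    "new": 2_641_656,
    "updated": 57_249_993,
    "unchanged": 27_399_683,
}
pe_counts = {
    1: 128_383,      # Evidence at protein level
    2: 1_084_733,    # Evidence at transcript level
    3: 21_022_078,   # Inferred from homology
    4: 65_056_138,   # Predicted
    5: 0,            # Uncertain
}
reported_total = 87_291_332

status_sum = sum(status_counts.values())
pe_sum = sum(pe_counts.values())

# Both breakdowns should reproduce the reported total.
print("status breakdown matches total:", status_sum == reported_total)
print("PE breakdown matches total:", pe_sum == reported_total)

# Share of entries whose existence is only predicted (PE level 4).
predicted_share = pe_counts[4] / reported_total
print(f"predicted share: {predicted_share:.1%}")
```

Both breakdowns reproduce the total, and roughly three quarters of the entries sit at PE level 4.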

Taxonomic Origin


Statistics on the number of species

Number of species in
New entries 11,990
Updated entries 110,158
Unchanged entries 526,989
Total 559,930

Sequence data

The shortest sequence is C4PYW0 at 2 AA, while the longest sequence is A0A1V4K6M4 at 36,991 AA.

Some annotation statistics

General Annotation (comments)

Annotations Entries
Allergenic properties 0 0
Alternative products 0 0
Biophysicochemical properties 0 0
Biotechnological use 0 0
Catalytic activity 9,793,168 8,972,610
Caution 44,950,512 43,981,235
Cofactor 6,684,289 0
Developmental stage 0 0
Involvement in disease 0 0
Disruption phenotype 0 0
Domain 601,537 578,487
Enzyme regulation 189,892 189,890
Function 10,950,784 10,593,934
Induction 41,306 41,306
Mass spectrometry 0 0
Miscellaneous 333,538 329,019
Pathway 4,950,617 4,504,262
Pharmaceutical use 0 0
Polymorphism 0 0
Post-translational modification 442,444 398,589
RNA Editing 0 0
Sequence caution 0 0
Sequence similarities 21,134,363 20,878,921
Subcellular Location 0 0
Subunit structure 5,849,439 5,818,126
Tissue specificity 0 0
Toxic dose 0 0

Sequence Annotation (features)

Annotations Entries
Molecule processing 11,725,129 5,874,938
Chain 5,848,586 5,846,476
Initiator methionine 21,415 21,415
Peptide 76 76
Propeptide 10,538 10,538
Signal peptide 5,844,420 5,844,411
Transit peptide 94 94
Regions 159,503,219 56,270,852
Calcium binding 205,895 101,264
Coiled-coil 5,767,228 3,873,631
Compositional bias 3,467 3,467
DNA binding 2,116,621 1,866,093
Domain 61,885,769 44,714,565
Motif 506,616 370,619
Nucleotide binding 4,444,417 2,877,744
Repeat 2,351,070 660,762
Region 3,000,006 1,571,621
Topological domain 90,356 29,516
Transmembrane 78,857,564 17,404,521
Zinc finger 273,291 206,932
Sites 23,683,024 5,171,640
Active site 4,610,518 2,812,547
Metal binding 7,951,981 2,139,172
Binding site 10,025,438 2,560,596
Other 1,095,087 610,914
Amino acid modifications 1,589,757 944,902
Cross-link 18,972 17,350
Disulfide bond 825,613 225,259
Glycosylation 958 426
Lipidation 16,014 14,372
Modified residue 725,608 699,329
Non-standard residue 2,592 2,401
Experimental info 14,006,089 8,999,714
Mutagenesis 0 0
Non-adjacent residues 0 0
Non-terminal residue 13,945,202 8,986,462
Sequence conflict 0 0
Sequence uncertainty 60,887 51,302
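
In the table above, the rows Molecule processing, Regions, Sites, Amino acid modifications and Experimental info act as category totals for the feature rows listed beneath them. The sketch below checks this for the Molecule processing block, using figures copied from the table; the variable names are illustrative. The annotation counts add up exactly, while the entry counts do not, presumably because a single entry can carry several feature types and is then counted once per type but only once in the category row.

```python
# (annotations, entries) pairs copied from the table above.
category_total = (11_725_129, 5_874_938)  # Molecule processing
feature_rows = {
    "Chain": (5_848_586, 5_846_476),
    "Initiator methionine": (21_415, 21_415),
    "Peptide": (76, 76),
    "Propeptide": (10_538, 10_538),
    "Signal peptide": (5_844_420, 5_844_411),
    "Transit peptide": (94, 94),
}

annotation_sum = sum(annotations for annotations, _ in feature_rows.values())
entry_sum = sum(entries for _, entries in feature_rows.values())

# Annotation counts are additive across the feature rows.
print("annotations add up:", annotation_sum == category_total[0])

# Entry counts are not additive: the per-row sum exceeds the category figure.
print("entry sum exceeds category entries:", entry_sum > category_total[1])
```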

Citation usage

Citation type Citations Entries
Submission 69,782,020 60,623,056
Journal article 33,767,172 31,880,862
Book 11,260 11,195
Thesis 12,071 12,012
Patent 1 1
Unpublished observations 0 0
Online journal article 0 0

Additional automatically mapped literature

Citation type Citations Entries
Journal articles 672,669 476,758

For information about which journals are used in citing or mapping to UniProtKB see the journals section.

Database Cross-Reference Statistics

Database Entities linked to Entries
Sequence databases
EMBL 98,567,871 84,215,245
PIR 163,350 131,096
RefSeq 44,174,379 43,202,643
UniGene 717,362 617,371
3D structure databases
PDB 33,136 16,452
PDBsum 32,767 16,243
ProteinModelPortal 7,649,159 7,649,159
SMR 1,039,697 1,039,697
Protein-protein interaction databases
DIP 3,292 3,286
IntAct 24,412 24,412
MINT 9,762 9,761
STRING 7,200,474 7,200,264
Chemistry
BindingDB 188 188
ChEMBL 871 871
DrugBank 540 318
GuidetoPHARMACOLOGY 4 4
SwissLipids 74 74
Protein family/group databases
Allergome 3,874 3,143
CAZy 129,629 121,314
ESTHER 70,761 70,465
MEROPS 252,079 252,078
MoonProt 3 3
PeroxiBase 2,483 2,475
REBASE 32,420 32,404
TCDB 7,725 7,709
mycoCLAP 448 448
PTM databases
PhosphoSitePlus 2,243 2,243
SwissPalm 1,220 1,220
UniCarbKB 17 17
iPTMnet 4,981 4,981
Polymorphism and mutation databases
2D gel databases
COMPLUYEAST-2DPAGE 4 4
OGP 3 3
REPRODUCTION-2DPAGE 64 63
SWISS-2DPAGE 1 1
World-2DPAGE 317 312
Proteomic databases
EPD 7,191 7,191
MaxQB 39,944 39,944
PRIDE 277,254 277,254
PaxDb 602,348 602,348
PeptideAtlas 119,460 119,460
ProMEX 3,061 3,061
TopDownProteomics 283 283
Protocols and materials databases
DNASU 41,388 40,949
Genome annotation databases
Ensembl 1,227,760 1,204,798
EnsemblBacteria 41,298,018 39,061,612
EnsemblFungi 5,491,967 5,343,820
EnsemblMetazoa 1,061,274 1,036,111
EnsemblPlants 1,758,832 1,644,444
EnsemblProtists 1,858,063 1,749,169
GeneDB 114,837 113,058
GeneID 9,213,582 9,105,561
Gramene 1,758,869 1,644,477
KEGG 13,397,794 12,987,412
PATRIC 18,556,168 18,556,090
UCSC 94,411 94,211
VectorBase 566,910 554,528
WBParaSite 854,121 845,711
Organism-specific databases
ArachnoServer 203 203
Araport 19,745 19,661
CGD 20,816 20,750
CTD 745,329 743,549
ConoServer 160 160
EuPathDB 583,482 583,482
FlyBase 222,771 221,306
H-InvDB 590 443
HGNC 50,792 50,693
LegioList 2,496 2,483
Leproma 1,271 1,269
MGI 60,057 59,682
MIM 4 4
MalaCards 9 9
OpenTargets 48,875 48,824
PharmGKB 3,154 3,154
PomBase 32 32
PseudoCAP 4,465 4,459
RGD 25,163 23,835
SGD 7 7
TAIR 15,933 15,855
TubercuList 1,005 1,004
WormBase 65,783 65,393
Xenbase 26,635 26,577
ZFIN 53,011 52,356
dictyBase 7,988 7,766
euHCVdb 75,267 75,264
Phylogenomic databases
GeneTree 1,207,099 1,206,968
HOGENOM 3,046,915 3,046,820
HOVERGEN 300,747 300,735
InParanoid 2,527,421 2,527,316
KO 5,740,619 5,716,550
OMA 6,524,810 6,524,803
OrthoDB 14,648,061 14,648,030
PhylomeDB 470,750 470,750
TreeFam 577,776 577,762
eggNOG 14,285,688 7,160,145
Enzyme and pathway databases
BRENDA 9,652 9,361
BioCyc 3,484,754 3,483,511
Reactome 241,471 87,961
SABIO-RK 599 599
SIGNOR 7 7
SignaLink 3,823 3,823
UniPathway 4,941,184 4,494,829
Other
ChiTaRS 86,279 86,120
EvolutionaryTrace 6,027 6,027
GenomeRNAi 30,328 30,328
PMAP-CutDB 131 131
PRO 2,259 2,259
Gene expression databases
Bgee 359,970 359,919
CollecTF 203 203
ExpressionAtlas 235,936 235,928
Genevisible 16,361 16,361
Ontologies
Family and domain databases
CDD 10,667,821 10,158,127
Gene3D 34,654,276 29,170,362
HAMAP 8,657,552 8,548,187
InterPro 194,313,862 67,990,044
PANTHER 13,781,932 13,245,482
PIRSF 7,281,956 7,221,461
PRINTS 11,888,444 10,716,432
PROSITE 43,834,632 29,113,734
Pfam 84,870,017 61,860,174
ProDom 1,334,682 1,271,160
SFLD 559,811 368,303
SMART 20,775,713 15,817,210
SUPFAM 56,050,282 44,368,799
TIGRFAMs 17,610,636 16,181,380
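
Each row of the cross-reference table gives two figures: the number of linked entities and the number of entries that carry at least one such link. Two derived quantities can make the rows easier to compare: coverage (linked entries divided by the total of 87,291,332 entries) and the average number of links per linked entry. The sketch below computes both for three rows copied from the table. The choice of rows and the names `coverage` and `links_per_entry` are illustrative, and these derived figures are not published on this page.

```python
TOTAL_ENTRIES = 87_291_332  # total number of entries reported for this release

# (entities linked to, entries) copied from the cross-reference table.
cross_refs = {
    "EMBL": (98_567_871, 84_215_245),
    "InterPro": (194_313_862, 67_990_044),
    "Pfam": (84_870_017, 61_860_174),
}

# Fraction of all entries that have at least one link to the database.
coverage = {
    name: entries / TOTAL_ENTRIES
    for name, (_, entries) in cross_refs.items()
}

# Average number of linked entities per entry that has a link.
links_per_entry = {
    name: entities / entries
    for name, (entities, entries) in cross_refs.items()
}

for name in cross_refs:
    print(f"{name}: coverage {coverage[name]:.1%}, "
          f"{links_per_entry[name]:.2f} links per linked entry")
```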

Web resource

No UniProtKB/TrEMBL entries have a link to a web page of general interest on the protein.

Amino acid distribution statistics

  • 9.0% Alanine
  • 5.6% Arginine
  • 3.9% Asparagine
  • 5.4% Aspartate
  • 1.2% Cysteine
  • 3.8% Glutamine
  • 6.1% Glutamate
  • 7.2% Glycine
  • 2.2% Histidine
  • 5.7% Isoleucine
  • 9.8% Leucine
  • 5.0% Lysine
  • 2.3% Methionine
  • 3.9% Phenylalanine
  • 4.8% Proline
  • 6.7% Serine
  • 5.5% Threonine
  • 1.2% Tryptophan
  • 2.9% Tyrosine
  • 6.8% Valine

Residue classes in the chart legend: aliphatic, acidic, small hydroxy, basic, amide, aromatic, sulfur.
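
The twenty percentages listed above are rounded to one decimal place, so they need not add up to exactly 100%. The sketch below sums them and picks out the most and least frequent residues. The values are copied from the list; the variable names are illustrative. The shortfall from 100% is presumably due to rounding and to residue codes outside the twenty standard amino acids, which this page does not break down.

```python
# Percentages copied from the list above.
composition = {
    "Alanine": 9.0, "Arginine": 5.6, "Asparagine": 3.9, "Aspartate": 5.4,
    "Cysteine": 1.2, "Glutamine": 3.8, "Glutamate": 6.1, "Glycine": 7.2,
    "Histidine": 2.2, "Isoleucine": 5.7, "Leucine": 9.8, "Lysine": 5.0,
    "Methionine": 2.3, "Phenylalanine": 3.9, "Proline": 4.8, "Serine": 6.7,
    "Threonine": 5.5, "Tryptophan": 1.2, "Tyrosine": 2.9, "Valine": 6.8,
}

# Round to one decimal to avoid floating-point noise in the sum.
total_percent = round(sum(composition.values()), 1)

most_common = max(composition, key=composition.get)
lowest_value = min(composition.values())
least_common = sorted(
    name for name, pct in composition.items() if pct == lowest_value
)

print("sum of listed percentages:", total_percent)
print("most common residue:", most_common)
print("least common residue(s):", ", ".join(least_common))
```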

Miscellaneous Statistics

1,628,973 entries are encoded on a mitochondrion, and 593,586 are encoded on a plasmid.

585,594 entries are encoded on a plastid, of which 785 are encoded on apicoplasts, 489,659 on chloroplasts, 1 on organellar chromatophores, 8 on cyanelles, 1,601 on non-photosynthetic plastids and 3,156 on unspecified types of plastid.