Introduction

Number of entries
New entries 5,177,933
Updated entries 16,500,700
Unchanged entries 103,118,475
Total 124,797,108
Entries with updated sequences 670
With a fragmented AA sequence 11,623,697
With known alternative products 0
Protein Existence (PE) Number of entries
1 Evidence at protein level 144,459
2 Evidence at transcript level 1,162,753
3 Inferred from homology 30,704,463
4 Predicted 92,785,433
5 Uncertain 0
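
As a quick consistency check on the two tables above, the following Python sketch verifies that both breakdowns add up to the stated total and prints each protein existence level as a share of all entries. The variable and function names are illustrative only; the figures are copied from this page.

```python
# Figures copied from the "Number of entries" and "Protein Existence (PE)" tables.
TOTAL_ENTRIES = 124_797_108

entries_by_status = {
    "New entries": 5_177_933,
    "Updated entries": 16_500_700,
    "Unchanged entries": 103_118_475,
}

entries_by_pe = {
    "1 Evidence at protein level": 144_459,
    "2 Evidence at transcript level": 1_162_753,
    "3 Inferred from homology": 30_704_463,
    "4 Predicted": 92_785_433,
    "5 Uncertain": 0,
}


def percent_of_total(count: int, total: int = TOTAL_ENTRIES) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


# Both breakdowns partition the same set of entries, so each must sum to the total.
status_sum = sum(entries_by_status.values())
pe_sum = sum(entries_by_pe.values())
print("Status breakdown matches total:", status_sum == TOTAL_ENTRIES)
print("PE breakdown matches total:", pe_sum == TOTAL_ENTRIES)

for label, count in entries_by_pe.items():
    print(f"{label:<32}{count:>12,}{percent_of_total(count):>8.2f}%")
```

Run as written, both checks succeed, and roughly three quarters of the entries fall under PE level 4 (Predicted).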

Taxonomic Origin


Statistics on the number of species

Number of species in
New entries 31,697
Updated entries 163,877
Unchanged entries 611,512
Total 705,327

Sequence data

The shortest sequence is C4PYW0 at 2 AA, while the longest sequence is A0A1V4K6M4 at 36,991 AA.

Some annotation statistics

General Annotation (comments)

Annotations Entries
Allergenic properties 0 0
Alternative products 0 0
Biophysicochemical properties 0 0
Biotechnological use 0 0
Catalytic activity 14,381,512 13,091,072
Caution 70,151,363 68,574,919
Cofactor 9,298,602 0
Developmental stage 0 0
Involvement in disease 0 0
Disruption phenotype 0 0
Domain 955,264 901,760
Enzyme regulation 0 0
Function 16,578,674 15,797,912
Induction 60,227 60,227
Mass spectrometry 0 0
Miscellaneous 566,341 557,689
Pathway 7,332,557 6,600,394
Pharmaceutical use 0 0
Polymorphism 0 0
Post-translational modification 865,602 716,418
RNA Editing 0 0
Sequence caution 0 0
Sequence similarities 30,758,219 30,336,990
Subcellular Location 0 0
Subunit structure 8,673,600 8,580,451
Tissue specificity 0 0
Toxic dose 0 0

Sequence Annotation (features)

Annotations Entries
Molecule processing 18,153,622 9,099,380
Chain 9,060,261 9,048,384
Initiator methionine 39,076 39,076
Peptide 715 450
Propeptide 17,692 17,692
Signal peptide 9,035,741 9,035,731
Transit peptide 137 137
Regions 238,784,447 82,030,329
Calcium binding 266,678 131,835
Coiled-coil 18,211,431 12,132,671
Compositional bias 4,682 4,682
DNA binding 3,158,468 2,799,079
Domain 87,687,613 63,302,921
Motif 1,425,739 994,209
Nucleotide binding 7,126,807 4,503,539
Repeat 4,821,710 1,152,766
Region 5,235,851 2,785,307
Topological domain 263,694 101,919
Transmembrane 110,147,613 24,205,755
Zinc finger 432,989 341,631
Sites 39,515,757 8,654,394
Active site 7,757,911 4,731,702
Metal binding 13,136,672 3,495,881
Binding site 16,581,714 4,203,990
Other 2,039,460 1,212,136
Amino acid modifications 4,620,318 2,617,961
Cross-link 29,523 27,492
Disulfide bond 1,914,045 513,206
Glycosylation 21,350 20,202
Lipidation 294,637 154,470
Modified residue 2,355,129 2,116,658
Non-standard residue 5,634 5,441
Experimental info 17,510,863 11,704,150
Mutagenesis 0 0
Non-adjacent residues 0 0
Non-terminal residue 17,423,045 11,676,371
Sequence conflict 0 0
Sequence uncertainty 87,818 73,176
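
Reading the two columns as "total annotations" and "entries carrying at least one such annotation", their ratio gives the average number of annotations per annotated entry. The Python sketch below computes that ratio for a few rows copied from the tables above. The function name is hypothetical, and rows with zero entries (such as Cofactor) are reported as undefined rather than divided.

```python
from typing import Optional

# (annotations, entries) pairs copied from the annotation tables on this page.
annotation_counts = {
    "Transmembrane": (110_147_613, 24_205_755),
    "Domain": (87_687_613, 63_302_921),
    "Disulfide bond": (1_914_045, 513_206),
    "Signal peptide": (9_035_741, 9_035_731),
    "Cofactor": (9_298_602, 0),
}


def mean_annotations_per_entry(annotations: int, entries: int) -> Optional[float]:
    """Average annotations per annotated entry, or None when entries is 0."""
    if entries == 0:
        return None
    return annotations / entries


for name, (annotations, entries) in annotation_counts.items():
    ratio = mean_annotations_per_entry(annotations, entries)
    shown = "undefined" if ratio is None else f"{ratio:.2f}"
    print(f"{name:<16}{annotations:>14,}{entries:>14,}  {shown}")
```

A ratio near 1 (Signal peptide) means almost every annotated entry has exactly one such feature, while a ratio above 4 (Transmembrane) reflects several features per annotated entry.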

Citation usage

Citation type Citations Entries
Submission 105,470,285 93,482,577
Journal article 40,477,391 38,305,854
Book 11,374 11,309
Thesis 14,951 14,892
Patent 1 1
Unpublished observations 0 0
Online journal article 0 0

Additional automatically mapped literature

Citation type Citations Entries
Journal articles 739,240 447,382

For information about which journals are used in citing or mapping to UniProtKB see the journals section.

Database Cross-Reference Statistics

Database Entities linked to Entries
Sequence databases
EMBL 137,136,197 120,773,326
PIR 162,677 130,437
RefSeq 44,968,918 43,861,624
UniGene 865,999 732,249
3D structure databases
DisProt 96 96
PDB 37,489 18,336
PDBsum 37,023 18,029
ProteinModelPortal 7,191,950 7,191,950
SMR 1,239,713 1,239,713
Protein-protein interaction databases
CORUM 114 114
ComplexPortal 182 133
DIP 3,216 3,215
ELM 107 107
IntAct 26,640 26,640
MINT 2,829 2,829
STRING 6,441,418 6,441,169
Chemistry
BindingDB 271 271
ChEMBL 965 965
DrugBank 742 449
GuidetoPHARMACOLOGY 4 4
SwissLipids 82 82
Protein family/group databases
Allergome 3,949 3,184
CAZy 129,045 120,758
ESTHER 74,490 74,192
MEROPS 243,185 243,184
MoonDB 1 1
MoonProt 64 64
PeroxiBase 2,475 2,467
REBASE 31,413 31,401
TCDB 8,186 8,175
UniLectin 156 156
mycoCLAP 447 447
PTM databases
CarbonylDB 265 265
GlyConnect 13 13
PhosphoSitePlus 2,240 2,240
SwissPalm 1,904 1,904
UniCarbKB 17 17
iPTMnet 5,136 5,136
Polymorphism and mutation databases
2D gel databases
COMPLUYEAST-2DPAGE 4 4
OGP 3 3
REPRODUCTION-2DPAGE 62 61
SWISS-2DPAGE 1 1
World-2DPAGE 316 311
Proteomic databases
EPD 14,085 14,085
MaxQB 42,229 42,229
PRIDE 336,418 336,418
PaxDb 328,104 328,104
PeptideAtlas 128,755 128,755
ProMEX 3,230 3,230
TopDownProteomics 280 280
Protocols and materials databases
DNASU 41,279 40,840
Genome annotation databases
Ensembl 1,908,003 1,864,078
EnsemblBacteria 39,072,240 36,859,779
EnsemblFungi 6,182,788 6,076,488
EnsemblMetazoa 1,179,113 1,126,925
EnsemblPlants 2,164,410 1,964,895
EnsemblProtists 1,872,785 1,760,840
GeneDB 114,675 112,895
GeneID 10,640,022 10,533,434
Gramene 2,164,410 1,964,895
KEGG 16,276,301 15,844,081
PATRIC 17,342,003 17,331,375
UCSC 93,123 92,918
VectorBase 580,334 561,628
WBParaSite 854,112 845,705
Organism-specific databases
ArachnoServer 200 200
Araport 15,228 15,161
CGD 20,801 20,735
CTD 1,139,402 1,137,430
ConoServer 159 159
EuPathDB 671,599 670,904
FlyBase 208,332 207,035
GeneCards 1,310 1,291
H-InvDB 587 440
HGNC 51,986 51,890
LegioList 2,496 2,483
Leproma 1,271 1,269
MGI 61,793 61,358
MIM 4 4
MalaCards 12 12
OpenTargets 49,923 49,874
PharmGKB 3,134 3,134
PomBase 2 2
PseudoCAP 4,449 4,445
RGD 21,620 20,727
SGD 7 7
TAIR 11,892 11,830
TubercuList 1,000 999
VGNC 79,929 79,929
WormBase 55,884 55,500
Xenbase 34,432 34,353
ZFIN 52,649 52,520
dictyBase 7,987 7,765
euHCVdb 75,267 75,264
Phylogenomic databases
GeneTree 1,831,524 1,831,446
HOGENOM 2,996,720 2,996,639
HOVERGEN 300,367 300,354
InParanoid 2,347,029 2,347,029
KO 7,159,396 7,130,103
OMA 6,851,650 6,851,573
OrthoDB 14,233,319 14,233,199
PhylomeDB 461,341 461,341
TreeFam 558,538 558,502
eggNOG 13,807,043 6,920,731
Enzyme and pathway databases
BRENDA 9,568 9,278
BioCyc 6,070,700 6,052,419
Reactome 327,177 116,140
SABIO-RK 644 644
SIGNOR 7 7
SignaLink 3,795 3,795
UniPathway 7,306,533 6,578,194
Other
ChiTaRS 131,460 131,459
EvolutionaryTrace 5,943 5,943
GenomeRNAi 29,997 29,997
PMAP-CutDB 130 130
PRO 2,259 2,259
Gene expression databases
Bgee 531,282 531,273
CollecTF 199 199
ExpressionAtlas 638,053 637,794
Genevisible 15,842 15,835
Ontologies
Family and domain databases
CDD 22,325,546 19,613,375
Gene3D 53,820,161 44,806,380
HAMAP 13,735,490 13,580,999
InterPro 315,189,003 96,517,965
PANTHER 25,728,130 24,835,119
PIRSF 10,831,819 10,741,297
PRINTS 16,287,824 14,696,318
PROSITE 62,184,467 41,476,922
Pfam 121,298,027 88,142,439
ProDom 1,747,109 1,674,847
SFLD 1,120,242 580,497
SMART 29,400,221 22,336,824
SUPFAM 80,659,935 63,852,314
TIGRFAMs 25,843,448 23,771,886

Web resource

0 UniProtKB/TrEMBL entries have at least one link to a webpage of general interest on the protein.

Amino acid distribution statistics

  • 9.1% Alanine
  • 5.7% Arginine
  • 3.8% Asparagine
  • 5.4% Aspartate
  • 1.1% Cysteine
  • 3.7% Glutamine
  • 6.1% Glutamate
  • 7.3% Glycine
  • 2.1% Histidine
  • 5.6% Isoleucine
  • 9.8% Leucine
  • 4.9% Lysine
  • 2.3% Methionine
  • 3.9% Phenylalanine
  • 4.8% Proline
  • 6.6% Serine
  • 5.5% Threonine
  • 1.3% Tryptophan
  • 2.9% Tyrosine
  • 6.9% Valine

Residue classes (legend): Aliphatic, Acidic, Small hydroxy, Basic, Amide, Aromatic, Sulfur
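
The percentages listed above can be sanity-checked with a short Python sketch that sums them and picks out the most and least frequent residues. The dictionary simply transcribes the list; the variable names are illustrative. The one-decimal figures do not have to sum to exactly 100%. Any shortfall is presumably due to rounding and to residue codes outside the 20 standard amino acids, which is an assumption rather than something stated on this page.

```python
# Percentages transcribed from the amino acid distribution list on this page.
amino_acid_percent = {
    "Alanine": 9.1, "Arginine": 5.7, "Asparagine": 3.8, "Aspartate": 5.4,
    "Cysteine": 1.1, "Glutamine": 3.7, "Glutamate": 6.1, "Glycine": 7.3,
    "Histidine": 2.1, "Isoleucine": 5.6, "Leucine": 9.8, "Lysine": 4.9,
    "Methionine": 2.3, "Phenylalanine": 3.9, "Proline": 4.8, "Serine": 6.6,
    "Threonine": 5.5, "Tryptophan": 1.3, "Tyrosine": 2.9, "Valine": 6.9,
}

# Round to one decimal to avoid floating-point noise in the sum.
total_percent = round(sum(amino_acid_percent.values()), 1)
most_common = max(amino_acid_percent, key=amino_acid_percent.get)
least_common = min(amino_acid_percent, key=amino_acid_percent.get)

print(f"Sum of listed percentages: {total_percent}%")
print(f"Most frequent residue:  {most_common} ({amino_acid_percent[most_common]}%)")
print(f"Least frequent residue: {least_common} ({amino_acid_percent[least_common]}%)")
```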

Miscellaneous Statistics

1,909,093 entries are encoded on a mitochondrion, and 762,415 are encoded on a plasmid.

773,306 entries are encoded on a plastid, of which 785 are encoded on apicoplasts, 648,339 on chloroplasts, 1 on organellar chromatophores, 8 on cyanelles, 1,521 on non-photosynthetic plastids and 3,190 on unspecified types of plastid.

UniProt is an ELIXIR core data resource
Main funding by: National Institutes of Health
