Release 9.0 of the UniProt Knowledgebase is composed of the UniProtKB/Swiss-Prot Protein Knowledgebase release 51.0 and the UniProtKB/TrEMBL Protein Database release 34.0.

More information on these databases can be found in the user manual section "What is the UniProt Knowledgebase?".


UniProtKB/Swiss-Prot protein knowledgebase release 51.0 statistics

Release 51.0 of 31-Oct-06 of UniProtKB/Swiss-Prot contains 241'242 sequence entries, comprising 88'541'632 amino acids abstracted from 148'048 references.

The growth of the database is summarized below.

Release  Date   Number of entries  Number of amino acids
2.0      09/86  3'939              900'163
3.0      11/86  4'160              969'641
4.0      04/87  4'387              1'036'010
5.0      09/87  5'205              1'327'683
6.0      01/88  6'102              1'653'982
7.0      04/88  6'821              1'885'771
8.0      08/88  7'724              2'224'465
9.0      11/88  8'702              2'498'140
10.0     03/89  10'008             2'952'613
11.0     07/89  10'856             3'265'966
12.0     10/89  12'305             3'797'482
13.0     01/90  13'837             4'347'336
14.0     04/90  15'409             4'914'264
15.0     08/90  16'941             5'486'399
16.0     11/90  18'364             5'986'949
17.0     02/91  20'024             6'524'504
18.0     05/91  20'772             6'792'034
19.0     08/91  21'795             7'173'785
20.0     11/91  22'654             7'500'130
21.0     03/92  23'742             7'866'596
22.0     05/92  25'044             8'375'696
23.0     08/92  26'706             9'011'391
24.0     12/92  28'154             9'545'427
25.0     04/93  29'955             10'214'020
26.0     07/93  31'808             10'875'091
27.0     10/93  33'329             11'484'420
28.0     02/94  36'000             12'496'420
29.0     06/94  38'303             13'464'008
30.0     10/94  40'292             14'147'368
31.0     02/95  43'470             15'335'248
32.0     11/95  49'340             17'385'503
33.0     02/96  52'205             18'531'384
34.0     10/96  59'021             21'210'389
35.0     11/97  69'113             25'083'768
36.0     07/98  74'019             26'840'295
37.0     12/98  77'977             28'268'293
38.0     07/99  80'000             29'085'965
39.0     05/00  86'593             31'411'114
40.0     10/01  101'602            37'315'215
41.0     02/03  122'564            44'986'459
42.0     10/03  135'850            50'046'799
43.0     03/04  146'720            54'093'154
44.0     07/04  153'871            56'608'159
45.0     10/04  163'235            59'631'787
46.0     02/05  168'297            61'443'278
47.0     05/05  181'577            65'746'672
48.0     09/05  194'317            70'391'852
49.0     02/06  207'132            75'438'310
50.0     05/06  222'289            81'585'146
51.0     10/06  241'242            88'541'632
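
The rows of the growth table can be read programmatically once the apostrophe thousands separators are stripped. The following is a minimal sketch, not part of the release itself: the function names `parse_growth_row` and `net_growth` are hypothetical, and only the last three rows of the table are used as sample input.

```python
def parse_growth_row(line):
    """Split one table row into (release, date, entries, amino_acids)."""
    release, date, entries, residues = line.split()
    # The table uses an apostrophe as thousands separator (e.g. 241'242).
    return release, date, int(entries.replace("'", "")), int(residues.replace("'", ""))


def net_growth(rows):
    """Return (release, net new entries, percent growth) for each consecutive pair."""
    result = []
    for previous, current in zip(rows, rows[1:]):
        added = current[2] - previous[2]
        result.append((current[0], added, 100 * added / previous[2]))
    return result


sample = [
    "49.0 02/06 207'132 75'438'310",
    "50.0 05/06 222'289 81'585'146",
    "51.0 10/06 241'242 88'541'632",
]
rows = [parse_growth_row(line) for line in sample]
growth = net_growth(rows)
for release, added, percent in growth:
    print(f"release {release}: {added} net new entries ({percent:.1f} %)")
```

Because `str.split()` splits on any run of whitespace, the same parser works whether the columns are separated by single spaces or padded for alignment. The figures it reports are net differences between releases, so they need not equal the number of newly added sequences when entries have also been deleted or merged.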

In rare cases, UniProtKB/Swiss-Prot entries are removed. Deleted entries are almost exclusively open reading frames (ORFs) that were wrongly predicted to code for proteins. When there is enough evidence that these hypothetical proteins are not real, we remove them from UniProtKB/Swiss-Prot. The document delac_sp.txt lists all accession numbers that were previously present in UniProtKB/Swiss-Prot but have since been deleted from the database.


Status of the model organisms

We have selected a number of organisms that are the target of genome sequencing and/or mapping projects and for which we intend to:

  • be as complete as possible. All sequences available at a given time should be immediately included in UniProtKB/Swiss-Prot. This also includes sequence corrections and updates;
  • provide a higher level of annotation;
  • provide cross-references to specialized database(s) that contain, among other data, some information about the genes that code for these proteins;
  • provide specific indexes and documents.

Our efforts to annotate human sequence entries as completely as possible gave rise to the HPI project, and the bacterial model organisms became the focus of the HAMAP project. The current status of the model organisms not covered by these two projects is as follows:

Organism        Database cross-references  Index file    Number of sequences
A.thaliana      TAIR                       arath.txt     4'551
C.albicans      None yet                   calbican.txt  572
C.elegans       Wormpep                    celegans.txt  2'966
D.discoideum    DictyBase                  dicty.txt     332
D.melanogaster  FlyBase                    fly.txt       2'436
M.musculus      MGD                        mgdtosp.txt   11'897
S.cerevisiae    SGD                        yeast.txt     5'916
S.pombe         GeneDB_SPombe              pombe.txt     3'082

UniProtKB/Swiss-Prot release statistics
                    
                    1.  INTRODUCTION
                    
                    Release 51.0 of 31-Oct-06 of UniProtKB/Swiss-Prot contains 241242 sequence entries,
                    comprising 88541632 amino acids abstracted from 148048 references.
                    
                    19061 sequences have been added since release 50.0, the sequence data of
                    1336 existing entries has been updated and the annotations of
                    222181 entries have been revised.
                    
                    The growth of the database is summarized in the release table above.
                    
                    
                    2.  AMINO ACID COMPOSITION
                    
                    2.1  Composition in percent for the complete database
                    
                    Ala (A) 7.89   Gln (Q) 3.95   Leu (L) 9.65   Ser (S) 6.82
                    Arg (R) 5.40   Glu (E) 6.67   Lys (K) 5.92   Thr (T) 5.41
                    Asn (N) 4.13   Gly (G) 6.96   Met (M) 2.38   Trp (W) 1.13
                    Asp (D) 5.35   His (H) 2.29   Phe (F) 3.96   Tyr (Y) 3.03
                    Cys (C) 1.50   Ile (I) 5.90   Pro (P) 4.83   Val (V) 6.73
                    
                    Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.00
                    
                    
                    2.2  Classification of the amino acids by their frequency
                    
                    Leu, Ala, Gly, Ser, Val, Glu, Lys, Ile, Thr, Arg, Asp, Pro, Asn, Phe,
                    Gln, Tyr, Met, His, Cys, Trp
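
The ordering in section 2.2 follows directly from the percentages in section 2.1. As an illustrative sketch (the names `composition` and `rank_by_frequency` are hypothetical and not part of the release), sorting the twenty standard residues by descending percentage gives the ranking:

```python
# Percent composition copied from section 2.1 (standard residues only).
composition = {
    "Ala": 7.89, "Gln": 3.95, "Leu": 9.65, "Ser": 6.82,
    "Arg": 5.40, "Glu": 6.67, "Lys": 5.92, "Thr": 5.41,
    "Asn": 4.13, "Gly": 6.96, "Met": 2.38, "Trp": 1.13,
    "Asp": 5.35, "His": 2.29, "Phe": 3.96, "Tyr": 3.03,
    "Cys": 1.50, "Ile": 5.90, "Pro": 4.83, "Val": 6.73,
}


def rank_by_frequency(percentages):
    """Return residue names ordered from most to least frequent."""
    ordered = sorted(percentages.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ordered]


ranking = rank_by_frequency(composition)
print(", ".join(ranking))
```

No two residues share the same percentage at two decimal places in this release, so the ordering is unambiguous.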
                    
                    
                    3.  TAXONOMIC ORIGIN
                    
                    Total number of species represented in this release of UniProtKB/Swiss-Prot: 10671
                    
                    The first twenty species represent 76403 sequences:  31.7 % of the total
                    number of entries.
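
The 31.7 % figure can be checked against the first twenty rows of the table in section 3.2 and the total of 241242 entries given in the introduction. A small sketch, with hypothetical variable names:

```python
TOTAL_ENTRIES = 241242  # from section 1

# Frequencies of the twenty most represented species, copied from section 3.2.
top_twenty = [
    14987, 11897, 5916, 5528, 4877, 4551, 3082, 2966, 2872, 2842,
    2436, 1837, 1782, 1774, 1587, 1556, 1509, 1508, 1486, 1410,
]

top_twenty_total = sum(top_twenty)
share = 100 * top_twenty_total / TOTAL_ENTRIES
print(f"{top_twenty_total} sequences: {share:.1f} % of the total")
```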
                    
                    
                    3.1 Table of the frequency of occurrence of species
                    
                    Species represented      1x: 5274
                                             2x: 1616
                                             3x:  788
                                             4x:  486
                                             5x:  340
                                             6x:  295
                                             7x:  202
                                             8x:  177
                                             9x:  156
                                            10x:   80
                                        11- 20x:  416
                                        21- 50x:  340
                                        51-100x:  133
                                          >100x:  368
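
As a consistency check, the bins of this table should add up to the total number of species reported at the start of section 3. The sketch below assumes nothing beyond the numbers printed above; the dictionary name `species_by_occurrence` is hypothetical.

```python
TOTAL_SPECIES = 10671  # from section 3

# Number of species per occurrence bin, copied from table 3.1.
species_by_occurrence = {
    "1x": 5274, "2x": 1616, "3x": 788, "4x": 486, "5x": 340,
    "6x": 295, "7x": 202, "8x": 177, "9x": 156, "10x": 80,
    "11-20x": 416, "21-50x": 340, "51-100x": 133, ">100x": 368,
}

bin_total = sum(species_by_occurrence.values())
single_share = 100 * species_by_occurrence["1x"] / TOTAL_SPECIES
print(f"{bin_total} species in total; {single_share:.1f} % are represented by one entry")
```

The bins do sum to the stated total, and roughly half of all species in this release are represented by a single sequence.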
                    
                    
                    3.2  Table of the most represented species
                    
                    ------  ---------  --------------------------------------------
                    Number  Frequency  Species
                    ------  ---------  --------------------------------------------
                    1      14987  Homo sapiens (Human)
                    2      11897  Mus musculus (Mouse)
                    3       5916  Saccharomyces cerevisiae (Baker's yeast)
                    4       5528  Rattus norvegicus (Rat)
                    5       4877  Escherichia coli
                    6       4551  Arabidopsis thaliana (Mouse-ear cress)
                    7       3082  Schizosaccharomyces pombe (Fission yeast)
                    8       2966  Caenorhabditis elegans
                    9       2872  Bos taurus (Bovine)
                    10       2842  Bacillus subtilis
                    11       2436  Drosophila melanogaster (Fruit fly)
                    12       1837  Escherichia coli O157:H7
                    13       1782  Methanococcus jannaschii
                    14       1774  Haemophilus influenzae
                    15       1587  Salmonella typhimurium
                    16       1556  Gallus gallus (Chicken)
                    17       1509  Escherichia coli O6
                    18       1508  Xenopus laevis (African clawed frog)
                    19       1486  Shigella flexneri
                    20       1410  Mycobacterium tuberculosis
                    21       1347  Pongo pygmaeus (Orangutan)
                    22       1182  Salmonella typhi
                    23       1153  Mycobacterium bovis
                    24       1105  Sus scrofa (Pig)
                    25       1089  Pseudomonas aeruginosa
                    26       1014  Oryza sativa (Rice)
                    27        971  Archaeoglobus fulgidus
                    28        970  Synechocystis sp. (strain PCC 6803)
                    29        930  Brachydanio rerio (Zebrafish) (Danio rerio)
                    30        884  Mimivirus
                    31        866  Yersinia pestis
                    32        863  Vibrio cholerae
                    33        857  Rhizobium meliloti (Sinorhizobium meliloti)
                    34        807  Oryctolagus cuniculus (Rabbit)
                    35        754  Aquifex aeolicus
                    36        723  Pasteurella multocida
                    37        707  Vibrio parahaemolyticus
                    38        690  Staphylococcus aureus (strain Mu50 / ATCC 700699)
                    39        688  Staphylococcus aureus (strain N315)
                    40        687  Mycoplasma pneumoniae
                    41        677  Streptomyces coelicolor
                    42        672  Staphylococcus aureus (strain MW2)
                    43        670  Staphylococcus aureus (strain COL)
                    44        669  Staphylococcus aureus (strain MRSA252)
                    45        668  Staphylococcus aureus (strain MSSA476)
                    46        660  Bacillus halodurans
                    47        659  Canis familiaris (Dog)
                    48        655  Macaca fascicularis (Crab eating macaque) (Cynomolgus monkey)
                    49        650  Vibrio vulnificus
                    50        631  Vibrio vulnificus (strain YJ016)
                    51        630  Mycobacterium leprae
                    52        612  Anabaena sp. (strain PCC 7120)
                    53        608  Treponema pallidum
                    54        589  Pseudomonas putida (strain KT2440)
                    55        589  Pseudomonas syringae pv. tomato
                    56        587  Bacillus anthracis
                    57        587  Methanobacterium thermoautotrophicum
                    58        581  Neurospora crassa
                    59        577  Staphylococcus epidermidis (strain ATCC 35984 / RP62A)
                    60        577  Staphylococcus epidermidis (strain ATCC 12228)
                    61        572  Buchnera aphidicola subsp. Acyrthosiphon pisum
                    62        572  Candida albicans (Yeast)
                    63        570  Helicobacter pylori (Campylobacter pylori)
                    64        569  Ashbya gossypii (Yeast) (Eremothecium gossypii)
                    65        568  Photorhabdus luminescens subsp. laumondii
                    66        565  Bradyrhizobium japonicum
                    67        562  Pan troglodytes (Chimpanzee)
                    68        562  Buchnera aphidicola subsp. Schizaphis graminum
                    69        561  Yersinia pseudotuberculosis
                    70        551  Helicobacter pylori J99 (Campylobacter pylori J99)
                    71        551  Ralstonia solanacearum (Pseudomonas solanacearum)
                    72        549  Rickettsia prowazekii
                    73        548  Zea mays (Maize)
                    74        548  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
                    75        543  Lactococcus lactis subsp. lactis (Streptococcus lactis)
                    76        540  Rhizobium loti (Mesorhizobium loti)
                    77        539  Listeria monocytogenes
                    78        535  Kluyveromyces lactis (Yeast) (Candida sphaerica)
                    79        531  Listeria innocua
                    80        528  Xanthomonas campestris pv. campestris
                    81        518  Neisseria meningitidis serogroup A
                    82        517  Neisseria meningitidis serogroup B
                    83        516  Shewanella oneidensis
                    84        512  Bacillus cereus (strain ATCC 14579 / DSM 31)
                    85        507  Buchnera aphidicola subsp. Baizongia pistaciae
                    86        507  Clostridium acetobutylicum
                    87        505  Caulobacter crescentus (Caulobacter vibrioides)
                    88        501  Erwinia carotovora subsp. atroseptica (Pectobacterium atrosepticum)
                    89        491  Xanthomonas axonopodis pv. citri
                    90        484  Candida glabrata (Yeast) (Torulopsis glabrata)
                    91        483  Mycoplasma genitalium
                    92        483  Thermotoga maritima
                    93        483  Salmonella paratyphi-a
                    94        478  Streptococcus pneumoniae
                    95        471  Xylella fastidiosa
                    96        470  Listeria monocytogenes serotype 4b (strain F2365)
                    97        462  Xylella fastidiosa (strain Temecula1 / ATCC 700964)
                    98        461  Deinococcus radiodurans
                    99        460  Brucella melitensis
                    100        460  Oceanobacillus iheyensis
                    101        460  Brucella suis
                    102        452  Haemophilus ducreyi
                    103        448  Methanosarcina acetivorans
                    104        446  Pyrococcus horikoshii
                    105        443  Corynebacterium glutamicum (Brevibacterium flavum)
                    106        441  Pyrococcus abyssi
                    107        441  Clostridium perfringens
                    108        439  Halobacterium salinarium (Halobacterium halobium)
                    109        435  Chlamydia trachomatis
                    110        429  Methanosarcina mazei (Methanosarcina frisia)
                    111        426  Bordetella bronchiseptica (Alcaligenes bronchisepticus)
                    112        421  Borrelia burgdorferi (Lyme disease spirochete)
                    113        420  Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
                    114        420  Photobacterium profundum (Photobacterium sp. (strain SS9))
                    115        417  Nicotiana tabacum (Common tobacco)
                    116        416  Pyrococcus furiosus
                    117        415  Chlamydia pneumoniae (Chlamydophila pneumoniae)
                    118        414  Chromobacterium violaceum
                    119        413  Bordetella parapertussis
                    120        413  Bordetella pertussis
                    121        411  Thermoanaerobacter tengcongensis
                    122        410  Bacillus cereus (strain ATCC 10987)
                    123        410  Lactobacillus plantarum
                    124        409  Synechococcus elongatus (Thermosynechococcus elongatus)
                    125        406  Chlamydia muridarum
                    126        405  Emericella nidulans (Aspergillus nidulans)
                    127        405  Rhizobium sp. (strain NGR234)
                    128        404  Campylobacter jejuni
                    129        401  Streptococcus pyogenes serotype M6
                    130        401  Streptococcus mutans
                    131        401  Ovis aries (Sheep)
                    132        400  Enterococcus faecalis (Streptococcus faecalis)
                    133        395  Sulfolobus solfataricus
                    134        395  Streptomyces avermitilis
                    135        395  Salmonella choleraesuis
                    136        393  Yarrowia lipolytica (Candida lipolytica)
                    137        389  Streptococcus pyogenes serotype M1
                    138        384  Streptococcus pyogenes serotype M18
                    139        383  Streptococcus pyogenes serotype M3
                    140        380  Rickettsia conorii
                    141        374  Bacillus thuringiensis subsp. konkukian
                    142        365  Chlorobium tepidum
                    143        361  Pyrococcus kodakaraensis (Thermococcus kodakaraensis)
                    144        360  Debaryomyces hansenii (Yeast) (Torulaspora hansenii)
                    145        360  Corynebacterium efficiens
                    146        356  Rhodopseudomonas palustris
                    147        356  Nitrosomonas europaea
                    148        354  Acinetobacter sp. (strain ADP1)
                    149        350  Methanopyrus kandleri
                    150        348  Aeropyrum pernix
                    151        347  Leptospira interrogans
                    152        342  Gloeobacter violaceus
                    153        341  Burkholderia pseudomallei (Pseudomonas pseudomallei)
                    154        341  Bacillus cereus (strain ZK / E33L)
                    155        339  Pisum sativum (Garden pea)
                    156        337  Leptospira interrogans serogroup Icterohaemorrhagiae serovar copenhageni
                    157        332  Dictyostelium discoideum (Slime mold)
                    158        332  Solanum lycopersicum (Tomato) (Lycopersicon esculentum)
                    159        331  Streptococcus agalactiae serotype III
                    160        331  Bacillus clausii (strain KSM-K16)
                    161        329  Streptococcus agalactiae serotype V
                    162        328  Synechococcus sp. (strain WH8102)
                    163        328  Sulfolobus tokodaii
                    164        326  Mannheimia succiniciproducens (strain MBEL55E)
                    165        321  Prochlorococcus marinus (strain MIT 9313)
                    166        321  Prochlorococcus marinus
                    167        319  Burkholderia mallei (Pseudomonas mallei)
                    168        318  Bacillus licheniformis (strain DSM 13 / ATCC 14580)
                    169        313  Methylococcus capsulatus
                    170        313  Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
                    171        312  Vibrio fischeri (strain ATCC 700601 / ES114)
                    172        311  Thermoplasma acidophilum
                    173        309  Staphylococcus aureus
                    174        308  Rhodopirellula baltica
                    175        305  Triticum aestivum (Wheat)
                    176        302  Fusobacterium nucleatum subsp. nucleatum
                    177        300  Mycobacterium paratuberculosis
                    178        300  Prochlorococcus marinus subsp. pastoris (strain CCMP 1378 / MED4)
                    179        300  Geobacillus kaustophilus
                    180        298  Synechococcus sp. (strain ATCC 27144 / PCC 6301 / SAUG 1402/1)
                    181        297  Coxiella burnetii
                    182        297  Staphylococcus haemolyticus (strain JCSC1435)
                    183        297  Macaca mulatta (Rhesus macaque)
                    184        297  Geobacter sulfurreducens
                    185        292  Glycine max (Soybean)
                    186        291  Staphylococcus saprophyticus subsp. saprophyticus
                    187        290  Aspergillus fumigatus (Sartorya fumigata)
                    188        287  Sulfolobus acidocaldarius
                    189        286  Idiomarina loihiensis
                    190        286  Solanum tuberosum (Potato)
                    191        286  Thermus thermophilus (strain HB8 / ATCC 27634 / DSM 579)
                    192        284  Pseudomonas putida
                    193        283  Bacteroides thetaiotaomicron
                    194        279  Wolinella succinogenes
                    195        279  Pyrobaculum aerophilum
                    196        278  Cavia porcellus (Guinea pig)
                    197        278  Nocardia farcinica
                    198        278  Hordeum vulgare (Barley)
                    199        277  Zymomonas mobilis
                    200        277  Clostridium tetani
                    201        275  Thermoplasma volcanium
                    202        274  Desulfovibrio vulgaris (strain Hildenborough / ATCC 29579 / NCIMB 8303)
                    203        269  Synechococcus sp. (strain PCC 7942) (Anacystis nidulans R2)
                    204        268  Bacteriophage T4
                    205        267  Symbiobacterium thermophilum
                    206        267  Spinacia oleracea (Spinach)
                    207        266  Corynebacterium diphtheriae
                    208        266  Shigella sonnei (strain Ss046)
                    209        261  Thermus thermophilus (strain HB27 / ATCC BAA-163 / DSM 7039)
                    210        261  Rhodobacter capsulatus (Rhodopseudomonas capsulata)
                    211        259  Azoarcus sp. (strain EbN1)
                    212        259  Brucella abortus
                    213        256  Legionella pneumophila subsp. pneumophila
                    214        255  Silicibacter pomeroyi
                    215        255  Pseudomonas fluorescens (strain Pf-5 / ATCC BAA-477)
                    216        255  Ureaplasma parvum (Ureaplasma urealyticum biotype 1)
                    217        254  Rhodobacter sphaeroides (strain ATCC 17023 / 2.4.1 / NCIB 8253 / DSM 158)
                    218        254  Vaccinia virus (strain Copenhagen) (VACV)
                    219        254  Wigglesworthia glossinidia brevipalpis
                    220        251  Haloarcula marismortui (Halobacterium marismortui)
                    221        251  Legionella pneumophila (strain Paris)
                    222        251  Helicobacter hepaticus
                    223        251  Methanococcus maripaludis
                    224        250  Xanthomonas oryzae pv. oryzae
                    225        249  Equus caballus (Horse)
                    226        249  Shigella boydii serotype 4 (strain Sb227)
                    227        249  Legionella pneumophila (strain Lens)
                    228        248  Bifidobacterium longum
                    229        247  Neisseria gonorrhoeae (strain ATCC 700825 / FA 1090)
                    230        245  Pseudomonas syringae pv. syringae (strain B728a)
                    231        242  Porphyromonas gingivalis (Bacteroides gingivalis)
                    232        241  Shigella dysenteriae serotype 1 (strain Sd197)
                    233        240  Chlamydophila caviae
                    234        240  Leifsonia xyli subsp. xyli
                    235        236  Haemophilus influenzae (strain 86-028NP)
                    236        235  Bacillus stearothermophilus (Geobacillus stearothermophilus)
                    237        232  Bacteroides fragilis
                    238        231  Blochmannia floridanus
                    239        229  Pseudomonas syringae pv. phaseolicola (strain 1448A / Race 6)
                    240        228  Gluconobacter oxydans (Gluconobacter suboxydans)
                    241        226  Campylobacter jejuni (strain RM1221)
                    242        225  Lactobacillus johnsonii
                    243        224  Propionibacterium acnes
                    244        223  Bartonella henselae (Rochalimaea henselae)
                    245        223  Desulfotalea psychrophila
                    246        222  Ralstonia eutropha (strain JMP134) (Alcaligenes eutrophus)
                    247        220  Porphyra purpurea
                    248        220  Chlamydomonas reinhardtii
                    249        216  Gorilla gorilla gorilla (Lowland gorilla)
                    250        213  Cryptococcus neoformans (Filobasidiella neoformans)
                    251        212  Bartonella quintana (Rochalimaea quintana)
                    252        212  Pseudomonas fluorescens (strain PfO-1)
                    253        211  Klebsiella pneumoniae
                    254        210  Xanthomonas campestris pv. campestris (strain 8004)
                    255        207  Cricetulus griseus (Chinese hamster)
                    256        206  Burkholderia sp. (strain 383) (Burkholderia cepacia
                    257        206  Anabaena variabilis (strain ATCC 29413 / PCC 7937)
                    258        205  Colwellia psychrerythraea (strain 34H / ATCC BAA-681) (Vibrio psychroerythus)
                    259        203  Bdellovibrio bacteriovorus
                    260        201  Felis silvestris catus (Cat)
                    261        200  Vaccinia virus (strain Western Reserve / WR) (VACV)
                    262        200  Streptococcus thermophilus (strain ATCC BAA-250 / LMG 18311)
                    
                    
                    
                    3.3  Taxonomic distribution of the sequences
                    
                    
                    Kingdom        sequences (% of the database)
                    Archaea           10690 (  4%)
                    Bacteria         116347 ( 48%)
                    Eukaryota        103579 ( 43%)
                    Viruses           10626 (  4%)
                    
                    
                    Within Eukaryota:
                    
                    
                    Category            sequences (% of Eukaryota) (% of the complete database)
                    Human                  14988 ( 14%)           (  6%)
                    Other Mammalia         32159 ( 31%)           ( 13%)
                    Other Vertebrata        9382 (  9%)           (  4%)
                    Viridiplantae          16436 ( 16%)           (  7%)
                    Fungi                  16083 ( 16%)           (  7%)
                    Insecta                 4691 (  5%)           (  2%)
                    Nematoda                3376 (  3%)           (  1%)
                    Other                   6464 (  6%)           (  3%)
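
The percentages in section 3.3 are the sequence counts divided by the relevant total and rounded to the nearest integer. The following sketch recomputes them; the helper `percent` and the dictionary names are illustrative assumptions, and the rounding rule is inferred from the printed values rather than stated in the release.

```python
TOTAL_ENTRIES = 241242  # from section 1

kingdoms = {"Archaea": 10690, "Bacteria": 116347, "Eukaryota": 103579, "Viruses": 10626}
eukaryota = {
    "Human": 14988, "Other Mammalia": 32159, "Other Vertebrata": 9382,
    "Viridiplantae": 16436, "Fungi": 16083, "Insecta": 4691,
    "Nematoda": 3376, "Other": 6464,
}


def percent(part, whole):
    """Share of `part` in `whole`, rounded to a whole percent."""
    return round(100 * part / whole)


kingdom_pct = {name: percent(count, TOTAL_ENTRIES) for name, count in kingdoms.items()}
# For each eukaryotic category: (% of Eukaryota, % of the complete database).
eukaryota_pct = {
    name: (percent(count, kingdoms["Eukaryota"]), percent(count, TOTAL_ENTRIES))
    for name, count in eukaryota.items()
}

for name, (of_euk, of_db) in eukaryota_pct.items():
    print(f"{name:<18}{of_euk:>3}% of Eukaryota, {of_db:>3}% of the database")
```

The four kingdom counts add up to the total number of entries, and the eight eukaryotic categories add up to the Eukaryota count. The rounded percentages do not have to sum to exactly 100.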
                    
                    
                    4.  SEQUENCE SIZE
                    
                    Distribution of the sequences by size (excluding fragments)
                    
                    From  To  Number             From   To    Number
                      1-  50    4507             1001-1100     2086
                     51- 100   16972             1101-1200     1370
                    101- 150   25049             1201-1300     1062
                    151- 200   23742             1301-1400      887
                    201- 250   24479             1401-1500      740
                    251- 300   20223             1501-1600      376
                    301- 350   21179             1601-1700      275
                    351- 400   19510             1701-1800      219
                    401- 450   15270             1801-1900      211
                    451- 500   13104             1901-2000      172
                    501- 550    9799             2001-2100      118
                    551- 600    6807             2101-2200      177
                    601- 650    5760             2201-2300      160
                    651- 700    3886             2301-2400      106
                    701- 750    3213             2401-2500       90
                    751- 800    2663             >2500          627
                    801- 850    2287
                    851- 900    2428
                    901- 950    1811
                    951-1000    1433
                    
                    
                    The average sequence length in UniProtKB/Swiss-Prot is 367 amino acids.
                    
                    The shortest sequence is   GWA_SEPOF (P83570):     2 amino acids.
                    The longest sequence is  TITIN_HUMAN (Q8WZ42): 34350 amino acids.
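
The average length quoted above is consistent with dividing the total number of amino acids by the total number of entries from section 1. This sketch assumes the average is taken over all entries, fragments included, which the release does not state explicitly:

```python
TOTAL_AMINO_ACIDS = 88541632  # from section 1
TOTAL_ENTRIES = 241242        # from section 1

average_length = TOTAL_AMINO_ACIDS / TOTAL_ENTRIES
print(f"average sequence length: {average_length:.2f} amino acids")
```

Rounded to the nearest integer, the result is 367.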
                    
                    
                    5.  JOURNAL CITATIONS
                    
                    Note: the following citation statistics reflect the number of distinct
                    journal citations.
                    
                    Total number of journals cited in this release of UniProtKB/Swiss-Prot: 1756
                    
                    
                    5.1 Table of the frequency of journal citations
                    
                    Journals cited      1x:  615
                                        2x:  231
                                        3x:  130
                                        4x:   89
                                        5x:   64
                                        6x:   48
                                        7x:   38
                                        8x:   29
                                        9x:   31
                                       10x:   21
                                   11- 20x:  120
                                   21- 50x:  151
                                   51-100x:   62
                                     >100x:  127
                    
                    
                    5.2  List of the most cited journals in UniProtKB/Swiss-Prot
                    
                    Nb    Citations   Journal name
                    --    ---------   -------------------------------------------------------------
                    1        14192   Journal of Biological Chemistry
                    2         6814   Proceedings of the National Academy of Sciences of the U.S.A.
                    3         4398   Journal of Bacteriology
                    4         4128   Gene
                    5         4011   Nucleic Acids Research
                    6         3687   Biochemical and Biophysical Research Communications
                    7         3459   FEBS Letters
                    8         3183   Biochemistry
                    9         3108   The EMBO Journal
                    10         2902   European Journal of Biochemistry
                    11         2727   Nature
                    12         2601   Biochimica et Biophysica Acta
                    13         2570   Molecular and Cellular Biology
                    14         2424   Journal of Molecular Biology
                    15         2250   Genomics
                    16         2195   Cell
                    17         1772   Biochemical Journal
                    18         1666   Science
                    19         1443   Molecular Microbiology
                    20         1329   Plant Molecular Biology
                    21         1265   Molecular and General Genetics
                    22         1192   Journal of Cell Biology
                    23         1177   Journal of Virology
                    24         1081   Virology
                    25         1062   Human Molecular Genetics
                    26         1059   Journal of Biochemistry
                    27         1010   Nature Genetics
                    28         1004   Genes and Development
                    29          906   Plant Physiology
                    30          904   Oncogene
                    31          873   The American Journal of Human Genetics
                    32          802   Human Mutation
                    33          763   Journal of Immunology
                    34          737   Infection and Immunity
                    35          726   Development
                    36          703   Structure
                    37          699   Genetics
                    38          681   Yeast
                    39          675   Archives of Biochemistry and Biophysics
                    40          641   Journal of General Virology
                    41          603   Microbiology
                    42          585   Molecular Biology of the Cell
                    43          551   FEMS Microbiology Letters
                    44          544   Blood
                    45          536   Nature Structural Biology
                    46          525   The Plant Cell
                    47          487   Human Genetics
                    48          475   Current Genetics
                    49          468   Cancer Research
                    50          467   Journal of Cell Science
                    51          465   Molecular Cell
                    52          444   Developmental Biology
                    53          429   Applied and Environmental Microbiology
                    54          426   Mechanisms of Development
                    55          426   The Plant Journal
                    56          413   Journal of Clinical Investigation
                    57          413   Protein Science
                    58          409   Neuron
                    59          406   Mammalian Genome
                    60          406   Acta Crystallographica, Section D
                    61          400   Molecular and Biochemical Parasitology
                    62          383   Molecular Endocrinology
                    63          376   Journal of Neuroscience
                    64          372   The Journal of Experimental Medicine
                    65          370   Current Biology
                    66          364   Immunogenetics
                    67          341   Journal of Molecular Evolution
                    68          333   Endocrinology
                    69          333   DNA and Cell Biology
                    70          322   Journal of Neurochemistry
                    71          307   DNA Sequence
                    72          291   The Journal of Clinical Endocrinology and Metabolism
                    73          291   American Journal of Physiology
                    74          285   Biological Chemistry Hoppe-Seyler
                    75          282   Toxicon
                    76          281   Molecular Biology and Evolution
                    77          274   Bioscience, Biotechnology, and Biochemistry
                    78          273   Brain Research. Molecular Brain Research
                    79          247   Cytogenetics and Cell Genetics
                    80          242   Journal of General Microbiology
                    81          231   Comparative Biochemistry and Physiology
                    82          229   Proteins
                    83          215   Hoppe-Seyler's Zeitschrift fur Physiologische Chemie
                    84          215   Antimicrobial Agents and Chemotherapy
                    85          212   Journal of Medical Genetics
                    86          210   Molecular Pharmacology
                    87          205   Peptides
                    88          193   Journal of Investigative Dermatology
                    89          186   Biology of Reproduction
                    90          181   Plant and Cell Physiology
                    91          181   DNA Research
                    92          180   Genome Research
                    93          178   Molecular Plant-Microbe Interactions
                    94          171   Nature Cell Biology
                    95          171   European Journal of Immunology
                    96          169   Virus Research
                    97          158   Experimental Cell Research
                    98          158   Tissue Antigens
                    99          158   DNA
                    100          157   Biochimie
                    101          150   RNA
                    102          146   Molecular and Cellular Endocrinology
                    103          146   Molecular Phylogenetics and Evolution
                    104          145   Hemoglobin
                    105          144   Bioorganicheskaia Khimiia
                    106          143   American Journal of Medical Genetics
                    107          137   Archives of Microbiology
                    108          134   Neurology
                    109          133   Annals of Neurology
                    110          132   Developmental Dynamics
                    111          131   European Journal of Human Genetics
                    112          129   Insect Biochemistry and Molecular Biology
                    113          126   Journal of Human Genetics
                    114          124   Genes to Cells
                    115          123   Immunity
                    116          118   Agricultural and Biological Chemistry
                    117          117   Molecular Reproduction and Development
                    118          116   General and Comparative Endocrinology
                    119          116   Animal Genetics
                    120          115   Planta
                    121          112   Diabetes
                    122          110   Molecular Immunology
                    123          108   Glycobiology
                    124          107   Developmental Cell
                    125          106   Investigative Ophthalmology and Visual Science
                    126          103   Journal of Protein Chemistry
                    127          101   The New England Journal of Medicine
                    
                    
                    6.  STATISTICS FOR SOME LINE TYPES
                    
                    The following table summarizes the total number of some UniProtKB/Swiss-Prot lines,
                    as well as the number of entries with at least one such line, and the
                    frequency of the lines.
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry
                    ---------------------------------  -------- ---------  ---------
                    
                    References (RL)                     475775              1.97
                    Journal                          417892    219561    1.73
                    Submitted to EMBL/GenBank/DDBJ    54275     47414    0.22
                    Submitted to Swiss-Prot             788       784   <0.01
                    Unpublished observations            629       623   <0.01
                    Submitted to other databases        579       566   <0.01
                    Book citation                       566       554   <0.01
                    Plant Gene Register                 531       519   <0.01
                    Thesis                              378       376   <0.01
                    Patent                              131       129   <0.01
                    Worm Breeder's Gazette                6         6   <0.01
                    
                    Comments (CC)                       967004              4.01
                    SIMILARITY                       271184    219367    1.12
                    FUNCTION                         169309    163414    0.70
                    SUBCELLULAR LOCATION             130620    130620    0.54
                    CATALYTIC ACTIVITY                91194     84076    0.38
                    SUBUNIT                           87937     87937    0.36
                    PATHWAY                           48385     41451    0.20
                    COFACTOR                          36561     32578    0.15
                    TISSUE SPECIFICITY                23913     23913    0.10
                    MISCELLANEOUS                     20879     18880    0.09
                    PTM                               19119     15678    0.08
                    DOMAIN                            14102     12187    0.06
                    ALTERNATIVE PRODUCTS              10235     10235    0.04
                    CAUTION                            8915      8197    0.04
                    INDUCTION                          6672      6672    0.03
                    INTERACTION                        5931      5931    0.02
                    DEVELOPMENTAL STAGE                5926      5926    0.02
                    ENZYME REGULATION                  3541      3541    0.01
                    DISEASE                            3457      2516    0.01
                    WEB RESOURCE                       3060      2533    0.01
                    MASS SPECTROMETRY                  2556      2135    0.01
                    BIOPHYSICOCHEMICAL PROPERTIES      1564      1564    0.01
                    POLYMORPHISM                        562       549   <0.01
                    RNA EDITING                         457       457   <0.01
                    ALLERGEN                            413       413   <0.01
                    TOXIC DOSE                          307       304   <0.01
                    BIOTECHNOLOGY                       136       136   <0.01
                    PHARMACEUTICAL                       69        69   <0.01
                    
                    Features (FT)                      1694975              7.03
                    CHAIN                            245076    237799    1.02
                    TRANSMEM                         152755     33630    0.63
                    TURN                             117303      9164    0.49
                    METAL                            104372     25294    0.43
                    STRAND                            90666      8491    0.38
                    HELIX                             86054      8898    0.36
                    CONFLICT                          85085     29528    0.35
                    TOPO_DOM                          80831     16416    0.34
                    DOMAIN                            76562     41407    0.32
                    CARBOHYD                          72770     18269    0.30
                    DISULFID                          71207     18191    0.30
                    ACT_SITE                          55572     32657    0.23
                    REPEAT                            51933      7667    0.22
                    BINDING                           47911     19347    0.20
                    MOD_RES                           42295     18871    0.18
                    VARIANT                           41927      8508    0.17
                    NP_BIND                           35175     25159    0.15
                    REGION                            35073     18260    0.15
                    COMPBIAS                          23586     13408    0.10
                    SIGNAL                            23081     23071    0.10
                    VAR_SEQ                           22047      9609    0.09
                    MUTAGEN                           17867      4388    0.07
                    MOTIF                             17158     11356    0.07
                    ZN_FING                           16881      6588    0.07
                    SITE                              14257      8162    0.06
                    NON_TER                           10836      8297    0.04
                    INIT_MET                          10172     10172    0.04
                    COILED                             8808      5634    0.04
                    PROPEP                             7394      6204    0.03
                    LIPID                              6883      4535    0.03
                    DNA_BIND                           6558      6123    0.03
                    PEPTIDE                            6429      3966    0.03
                    TRANSIT                            4212      4175    0.02
                    CA_BIND                            2640      1086    0.01
                    CROSSLNK                           1743      1175    0.01
                    NON_CONS                           1150       519   <0.01
                    UNSURE                              457       178   <0.01
                    SE_CYS                              249       180   <0.01
                    
                    Cross-references (DR)              3117026             12.92
                    InterPro                         579873    222611    2.40
                    EMBL                             456684    232945    1.89
                    Pfam                             304763    215339    1.26
                    PROSITE                          224649    137580    0.93
                    GO                               212471     91220    0.88
                    GenomeReviews                    138640    122934    0.57
                    KEGG                             113059    102134    0.47
                    PIR                               97026     90613    0.40
                    TIGRFAMs                          94134     88121    0.39
                    HAMAP                             92615     92497    0.38
                    PRINTS                            89163     70134    0.37
                    HSSP                              78938     78938    0.33
                    SMART                             71541     54152    0.30
                    BioCyc                            70591     65335    0.29
                    ProDom                            57638     55724    0.24
                    Ensembl                           42511     42498    0.18
                    UniGene                           38777     36129    0.16
                    PANTHER                           38091     37880    0.16
                    PDB                               36752     10060    0.15
                    SMR                               34082     34082    0.14
                    ArrayExpress                      33838     33838    0.14
                    RZPD-ProtExp                      25639     12023    0.11
                    TIGR                              22645     22052    0.09
                    PIRSF                             19888     19634    0.08
                    LinkHub                           17389     17388    0.07
                    HGNC                              14412     14352    0.06
                    MIM                               12287     10033    0.05
                    MGI                               11746     11700    0.05
                    IntAct                            10997     10997    0.05
                    SGD                                5974      5906    0.02
                    MEROPS                             5241      4936    0.02
                    RGD                                5225      5222    0.02
                    GermOnline                         4925      4879    0.02
                    TAIR                               4609      4521    0.02
                    EcoGene                            4259      4256    0.02
                    EchoBASE                           4160      4128    0.02
                    H-InvDB                            3677      3659    0.02
                    WormPep                            3566      2963    0.01
                    WormBase                           3195      3114    0.01
                    FlyBase                            3164      3040    0.01
                    GeneDB_Spombe                      3115      3080    0.01
                    TRANSFAC                           2862      2569    0.01
                    SubtiList                          2784      2783    0.01
                    Gramene                            2675      2675    0.01
                    GeneFarm                           1761      1742    0.01
                    StyGene                            1543      1539    0.01
                    HPA                                1480      1320    0.01
                    TubercuList                        1438      1402    0.01
                    SWISS-2DPAGE                       1170      1170   <0.01
                    ListiList                          1071      1063   <0.01
                    Reactome                           1003      1003   <0.01
                    ZFIN                                917       907   <0.01
                    Leproma                             633       630   <0.01
                    AGD                                 575       569   <0.01
                    PhotoList                           568       568   <0.01
                    LegioList                           500       500   <0.01
                    MaizeDB                             439       434   <0.01
                    OGP                                 375       374   <0.01
                    HIV                                 361       356   <0.01
                    REBASE                              353       349   <0.01
                    ECO2DBASE                           351       299   <0.01
                    DictyBase                           334       331   <0.01
                    SagaList                            332       331   <0.01
                    GlycoSuiteDB                        282       282   <0.01
                    PeroxiBase                          265       258   <0.01
                    PHCI-2DPAGE                         241       241   <0.01
                    MypuList                            189       189   <0.01
                    Aarhus/Ghent-2DPAGE                 128        98   <0.01
                    Siena-2DPAGE                        103       103   <0.01
                    HSC-2DPAGE                           85        85   <0.01
                    PhosSite                             70        70   <0.01
                    COMPLUYEAST-2DPAGE                   59        59   <0.01
                    PMMA-2DPAGE                          52        52   <0.01
                    PptaseDB                             29        29   <0.01
                    Rat-heart-2DPAGE                     28        28   <0.01
                    ANU-2DPAGE                           21        21   <0.01
                    
                    Number of explicitly cross-referenced databases: 78
                    Number of implicitly cross-referenced databases: 27
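
The "Average per entry" column in the table above appears to be the total number of lines divided by the 241'242 entries in this release, rounded to two decimals, with very small values printed as "<0.01". The following Python sketch reproduces that arithmetic; the function name and the rounding threshold are assumptions made for illustration, not part of the release notes.

```python
# Entry count of UniProtKB/Swiss-Prot release 51.0, as stated in the release header.
TOTAL_ENTRIES = 241_242


def average_per_entry(total_lines: int, total_entries: int = TOTAL_ENTRIES) -> str:
    """Format lines-per-entry as in the table: two decimals, or '<0.01'.

    Assumption: values that would round to 0.00 are shown as '<0.01'.
    """
    avg = total_lines / total_entries
    return "<0.01" if avg < 0.005 else f"{avg:.2f}"


# A few rows taken from the table above.
for label, total in [
    ("References (RL)", 475_775),
    ("Comments (CC)", 967_004),
    ("Features (FT)", 1_694_975),
    ("Cross-references (DR)", 3_117_026),
    ("Thesis", 378),
]:
    print(f"{label:<24}{average_per_entry(total):>6}")
```

The "Number of entries" column is not used in this calculation: the average is taken over all entries, not only over entries that carry the line type.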
                    
                    
                    7.  MISCELLANEOUS STATISTICS
                    
                    Total number of distinct authors cited in UniProtKB/Swiss-Prot: 230300
                    
                    Total number of entries encoded on a Mitochondrion: 4085
                    Total number of entries encoded on a Plasmid: 3160
                    Total number of entries encoded on a Plastid: 26
                    Total number of entries encoded on a Plastid; Apicoplast: 6
                    Total number of entries encoded on a Plastid; Chloroplast: 5862
                    Total number of entries encoded on a Plastid; Cyanelle: 145
                    Total number of entries encoded on a Plastid; Non-photosynthetic plastid: 90
                    
                    Number of fragments: 8444
                    Number of additional sequences produced by alternative splicing, initiation or promoter usage: 16655 
                    
                

UniProtKB/TrEMBL protein database release 34.0 statistics

                    
                    1.  INTRODUCTION
                    
                    Release 34.0 of 31-Oct-2006 of UniProtKB/TrEMBL contains 3313264 sequence entries
                    comprising 1073273937 amino acids.
                    
                    Since release 33, 497407 sequences have been added, the sequence data of
                    2732 existing entries has been updated, and the annotations of
                    2815857 entries have been revised. The number of entries has increased by 18%.
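
The 18% figure can be checked from the numbers above. The Python sketch below assumes that the increase is measured against the size of the previous release and that no entries were removed between releases; neither assumption is stated in the release notes.

```python
def percent_increase(current_total: int, added: int) -> float:
    """Growth relative to the previous release, in percent."""
    # Assumption: previous size = current size minus newly added entries.
    previous_total = current_total - added
    return 100.0 * added / previous_total


current_total = 3_313_264  # entries in release 34.0
added = 497_407            # sequences added since release 33

growth = percent_increase(current_total, added)
print(f"previous release (implied): {current_total - added} entries")
print(f"increase: {growth:.1f}% (rounds to {round(growth)}%)")
```

Under these assumptions the implied size of release 33 is 2815857 entries, the same number as the count of entries whose annotations were revised.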
                    
                    
                    2.  AMINO ACID COMPOSITION
                    
                    2.1  Composition in percent for the complete database
                    
                    Ala (A) 8.30   Gln (Q) 3.93   Leu (L) 9.81   Ser (S) 6.92
                    Arg (R) 5.52   Glu (E) 6.04   Lys (K) 5.27   Thr (T) 5.65
                    Asn (N) 4.32   Gly (G) 7.01   Met (M) 2.40   Trp (W) 1.34
                    Asp (D) 5.21   His (H) 2.24   Phe (F) 4.05   Tyr (Y) 3.04
                    Cys (C) 1.40   Ile (I) 5.94   Pro (P) 4.88   Val (V) 6.59
                    
                    Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.05
                    
                    
                    2.2  Classification of the amino acids by their frequency
                    
                    Leu, Ala, Gly, Ser, Val, Glu, Ile, Thr, Arg, Lys, Asp, Pro, Asn, Phe,
                    Gln, Tyr, Met, His, Cys, Trp
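
The ordering in section 2.2 follows directly from the percentages in section 2.1. A minimal Python sketch of the ranking, using the twenty standard residues and the values listed above (the variable names are illustrative):

```python
# Composition in percent, copied from section 2.1.
composition = {
    "Ala": 8.30, "Gln": 3.93, "Leu": 9.81, "Ser": 6.92,
    "Arg": 5.52, "Glu": 6.04, "Lys": 5.27, "Thr": 5.65,
    "Asn": 4.32, "Gly": 7.01, "Met": 2.40, "Trp": 1.34,
    "Asp": 5.21, "His": 2.24, "Phe": 4.05, "Tyr": 3.04,
    "Cys": 1.40, "Ile": 5.94, "Pro": 4.88, "Val": 6.59,
}

# Sort residue names by decreasing frequency.
ranking = sorted(composition, key=composition.get, reverse=True)
print(", ".join(ranking))
```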
                    
                    
                    3.  TAXONOMIC ORIGIN
                    
                    Total number of species represented in this release of UniProtKB/TrEMBL: 119998
                    
                    The twenty most represented species account for 673630 sequences, or 20.3% of the
                    total number of entries.
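
As a cross-check, the figure can be recomputed from the twenty largest counts listed in section 3.2 and the release total of 3313264 entries. This is a small Python sketch; the variable names are illustrative.

```python
# Counts for ranks 1-20, copied from the table in section 3.2.
top_twenty = [
    162793, 71887, 55035, 47627, 44945, 32207, 28028, 27313, 24948, 20246,
    20134, 17387, 16934, 16817, 16450, 15078, 14942, 14666, 13103, 13090,
]
total_entries = 3_313_264  # UniProtKB/TrEMBL release 34.0

top_sum = sum(top_twenty)
share = 100.0 * top_sum / total_entries
print(f"{top_sum} sequences, {share:.1f}% of all entries")
```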
                    
                    
                    3.1 Table of the frequency of occurrence of species
                    
                    Species represented      1x: 55767
                                             2x: 22572
                                             3x: 11680
                                             4x:  6456
                                             5x:  3574
                                             6x:  2775
                                             7x:  1989
                                             8x:  1662
                                             9x:  1274
                                            10x:  1259
                                        11- 20x:  6025
                                        21- 50x:  2493
                                        51-100x:  1023
                                          >100x:  1449
                    
                    
                    3.2  Table of the most represented species
                    
                    ------  ---------  --------------------------------------------
                    Number  Frequency  Species
                    ------  ---------  --------------------------------------------
                    1     162793  Human immunodeficiency virus 1
                    2      71887  Oryza sativa (japonica cultivar-group)
                    3      55035  Homo sapiens (Human)
                    4      47627  Mus musculus (Mouse)
                    5      44945  Arabidopsis thaliana (Mouse-ear cress)
                    6      32207  Hepatitis C virus
                    7      28028  Tetraodon nigroviridis (Green puffer)
                    8      27313  Tetrahymena thermophila SB210
                    9      24948  Drosophila melanogaster (Fruit fly)
                    10      20246  Caenorhabditis elegans
                    11      20134  Trypanosoma cruzi
                    12      17387  Medicago truncatula (Barrel medic)
                    13      16934  Brachydanio rerio (Zebrafish) (Danio rerio)
                    14      16817  Aedes aegypti (Yellowfever mosquito)
                    15      16450  Phaeosphaeria nodorum SN15
                    16      15078  Anopheles gambiae str. PEST
                    17      14942  uncultured bacterium
                    18      14666  Plasmodium chabaudi
                    19      13103  Caenorhabditis briggsae
                    20      13090  Dictyostelium discoideum AX4
                    21      12866  Hepatitis B virus (HBV)
                    22      12285  Xenopus laevis (African clawed frog)
                    23      12042  Aspergillus oryzae
                    24      11773  Plasmodium berghei
                    25      11656  Gibberella zeae (Fusarium graminearum)
                    26      11001  Chaetomium globosum CBS 148.51
                    27      10779  Neurospora crassa
                    28      10404  Aspergillus terreus NIH2624
                    29      10299  Coccidioides immitis RS
                    30      10060  Drosophila pseudoobscura (Fruit fly)
                    31      10030  Aspergillus fumigatus (Sartorya fumigata)
                    32       9704  Schistosoma japonicum (Blood fluke)
                    33       9671  Emericella nidulans (Aspergillus nidulans)
                    34       9449  Trypanosoma brucei
                    35       9386  Candida albicans (Yeast)
                    36       9325  Rattus norvegicus (Rat)
                    37       9089  Entamoeba histolytica HM-1:IMSS
                    38       9042  Rhodococcus sp. (strain RHA1)
                    39       9000  Escherichia coli
                    40       8513  Burkholderia xenovorans (strain LB400)
                    41       8512  Stigmatella aurantiaca DW4/3-1
                    42       8217  Bos taurus (Bovine)
                    43       8109  Bradyrhizobium japonicum
                    44       8063  Solibacter usitatus Ellin6076
                    45       7937  Frankia sp. EAN1pec
                    46       7809  Plasmodium yoelii yoelii
                    47       7663  Burkholderia vietnamiensis G4
                    48       7533  Streptomyces coelicolor
                    49       7509  Burkholderia sp. (strain 383) (Burkholderia cepacia 
                    50       7432  Bradyrhizobium sp. BTAi1
                    51       7314  Streptomyces avermitilis
                    52       7262  Myxococcus xanthus (strain DK 1622)
                    53       7152  Rhizobium loti (Mesorhizobium loti)
                    54       7106  Leishmania major
                    55       7062  Rhizobium leguminosarum bv. viciae (strain 3841)
                    56       7049  Burkholderia cenocepacia HI2424
                    57       6963  Rhodopirellula baltica
                    58       6951  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
                    59       6776  Pseudomonas aeruginosa
                    60       6711  Frankia alni ACN14a
                    61       6679  Psychroflexus torquis ATCC 700755
                    62       6629  Hahella chejuensis (strain KCTC 2396)
                    63       6607  Burkholderia cepacia AMMD
                    64       6545  Ustilago maydis (Smut fungus)
                    65       6419  Cryptococcus neoformans (Filobasidiella neoformans)
                    66       6394  Giardia lamblia ATCC 50803
                    67       6393  Burkholderia cenocepacia (strain AU 1054)
                    68       6383  Cryptococcus neoformans var. neoformans B-3501A
                    69       6337  Sinorhizobium medicae WSM419
                    70       6280  Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
                    71       6225  Ralstonia eutropha (strain JMP134) (Alcaligenes eutrophus)
                    72       6219  Ralstonia metallidurans (strain CH34 / ATCC 43123 / DSM 2839)
                    73       6217  Yarrowia lipolytica (Candida lipolytica)
                    74       6204  Bacillus anthracis
                    75       6201  Ralstonia eutropha H16
                    76       6153  Simian immunodeficiency virus (isolate CPZ GAB1) (SIV-cpz) 
                    77       6150  Burkholderia pseudomallei (strain 1710b)
                    78       6129  Bacillus thuringiensis serovar israelensis ATCC 35646
                    79       6025  Plasmodium falciparum
                    80       5989  Debaryomyces hansenii (Yeast) (Torulaspora hansenii)
                    81       5979  Mycobacterium vanbaalenii PYR-1
                    82       5936  Yersinia pestis
                    83       5904  Bacillus cereus G9241
                    84       5896  Rhizobium meliloti (Sinorhizobium meliloti)
                    85       5881  Pseudomonas fluorescens (strain Pf-5 / ATCC BAA-477)
                    86       5852  Mycobacterium sp. KMS
                    87       5811  Rhizobium etli (strain CFN 42 / ATCC 51251)
                    88       5696  Crocosphaera watsonii
                    89       5689  Bacillus sp. NRRL B-14911
                    90       5687  Mycobacterium sp. JLS
                    91       5665  Nocardia farcinica
                    92       5599  Burkholderia pseudomallei (Pseudomonas pseudomallei)
                    93       5590  Mycobacterium sp. (strain MCS)
                    94       5589  Helicobacter pylori (Campylobacter pylori)
                    95       5553  Gallus gallus (Chicken)
                    96       5538  Photobacterium profundum 3TCK
                    97       5534  Anabaena sp. (strain PCC 7120)
                    98       5523  Bacillus weihenstephanensis KBAB4
                    99       5516  Pseudomonas fluorescens (strain PfO-1)
                    100       5513  Mycobacterium flavescens PYR-GCK
                    
                    
                    3.3  Taxonomic distribution of the sequences
                    
                    Kingdom        sequences (% of the database)
                    Archaea           74858 (  2%)
                    Bacteria        1612809 ( 49%)
                    Eukaryota       1184862 ( 36%)
                    Viruses          437391 ( 13%)
                    Other              3342 ( <1%)
                    
                    
                    
                    Within Eukaryota:
                    
                    Category            sequences (% of Eukaryota) (% of the complete database)
                    Human                  55035 (  5%)           (  2%)
                    Other Mammalia        119094 ( 10%)           (  4%)
                    Other Vertebrata      157328 ( 13%)           (  5%)
                    Viridiplantae         259197 ( 22%)           (  8%)
                    Fungi                 187904 ( 16%)           (  6%)
                    Insecta               134424 ( 11%)           (  4%)
                    Nematoda               36759 (  3%)           (  1%)
                    Other                 235121 ( 20%)           (  7%)
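
Both percentage columns in the Eukaryota breakdown can be recomputed from the raw counts. The Python sketch below assumes each percentage is rounded to the nearest integer; the category counts are copied from the table above and the names are illustrative.

```python
# Sequence counts per category, copied from the "Within Eukaryota" table.
eukaryota = {
    "Human": 55035,
    "Other Mammalia": 119094,
    "Other Vertebrata": 157328,
    "Viridiplantae": 259197,
    "Fungi": 187904,
    "Insecta": 134424,
    "Nematoda": 36759,
    "Other": 235121,
}
database_total = 3_313_264            # all UniProtKB/TrEMBL entries in this release
eukaryota_total = sum(eukaryota.values())

# Map each category to (% of Eukaryota, % of the complete database).
percentages = {
    name: (round(100 * n / eukaryota_total), round(100 * n / database_total))
    for name, n in eukaryota.items()
}

for name, (pct_euk, pct_db) in percentages.items():
    print(f"{name:<18}{eukaryota[name]:>8} ({pct_euk:>3}%) ({pct_db:>3}%)")
```

The category counts add up to the Eukaryota total of 1184862 given in the kingdom table.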
                    
                    
                    
                    4.  SEQUENCE SIZE
                    
                    Distribution of the sequences by size (excluding fragments)
                    
                    From   To  Number             From   To   Number
                    1-  50   42142             1001-1100    19892
                    51- 100  217858             1101-1200    14274
                    101- 150  273464             1201-1300    10158
                    151- 200  258617             1301-1400     6712
                    201- 250  259851             1401-1500     5505
                    251- 300  246860             1501-1600     3963
                    301- 350  231466             1601-1700     3126
                    351- 400  183758             1701-1800     2686
                    401- 450  148439             1801-1900     1991
                    451- 500  127714             1901-2000     1673
                    501- 550   93656             2001-2100     1295
                    551- 600   68771             2101-2200     1321
                    601- 650   51687             2201-2300     1094
                    651- 700   40198             2301-2400      887
                    701- 750   35649             2401-2500      672
                    751- 800   31903             >2500         6094
                    801- 850   23601
                    851- 900   20923
                    901- 950   15229
                    951-1000   11919
                    
                    
                    
                    The average sequence length in UniProtKB/TrEMBL is 323 amino acids.
                    
                    The shortest sequence is Q96AT0_HUMAN: 4 amino acids.
                    The longest sequence is Q3ASY8_CHLCH: 36805 amino acids.
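
The average length can be related to the totals given in the introduction (1073273937 amino acids in 3313264 entries). In the Python sketch below, the exact quotient is slightly under 324 and integer division gives 323; whether the published figure was truncated or computed over a slightly different set of entries is not stated, so treat this as an assumption.

```python
total_residues = 1_073_273_937  # amino acids in release 34.0
total_entries = 3_313_264       # sequence entries in release 34.0

mean_length = total_residues / total_entries
truncated_mean = total_residues // total_entries  # integer division drops the fraction

print(f"exact mean:     {mean_length:.2f} amino acids")
print(f"truncated mean: {truncated_mean} amino acids")
```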
                    
                    
                    
                    5.  STATISTICS FOR SOME LINE TYPES
                    
                    The following table summarizes the total number of some UniProtKB/TrEMBL lines,
                    as well as the number of entries with at least one such line, and the
                    frequency of the lines.
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry
                    ---------------------------------  -------- ---------  ---------
                    
                    References (RL)                    4959113              1.50
                    Submitted to EMBL/GenBank/DDBJ  2564765   1763042    0.77
                    Journal                         2340853   1911015    0.71
                    Thesis                             5927      5875   <0.01
                    Book citation                      4222      4177   <0.01
                    Submitted to other databases        390       382   <0.01
                    Other                             42956     27184    0.01
                    
                    Comments (CC)                      1594288              0.48
                    CAUTION                          660733    660733    0.20
                    SIMILARITY                       339625    332287    0.10
                    SUBCELLULAR LOCATION             146413    146413    0.04
                    FUNCTION                         143112    137402    0.04
                    CATALYTIC ACTIVITY               111155    106722    0.03
                    SUBUNIT                           81452     81452    0.02
                    COFACTOR                          69193     68827    0.02
                    PATHWAY                           28469     24415    0.01
                    DOMAIN                             7826      7061   <0.01
                    MISCELLANEOUS                      3690      3690   <0.01
                    INTERACTION                        2586      2586   <0.01
                    MASS SPECTROMETRY                    28        20   <0.01
                    ALLERGEN                              6         6   <0.01
                    
                    Features (FT)                      1584760              0.48
                    NON_TER                         1415744    846140    0.43
                    SIGNAL                           117681    113596    0.04
                    CHAIN                             50795     29813    0.02
                    TRANSIT                             540       536   <0.01
                    
                    Cross-references (DR)             26048671              7.86
                    GO                              6905142   1966760    2.08
                    InterPro                        4978587   2263523    1.50
                    EMBL                            3795886   3304821    1.15
                    Pfam                            2836849   2111526    0.86
                    PROSITE                         1568909   1014438    0.47
                    KEGG                             886563    848900    0.27
                    GenomeReviews                    847386    805667    0.26
                    PRINTS                           640221    533328    0.19
                    SMART                            543869    423745    0.16
                    TIGRFAMs                         404484    373646    0.12
                    SMR                              383447    383385    0.12
                    ProDom                           370442    352631    0.11
                    BioCyc                           286378    271096    0.09
                    HSSP                             275921    275518    0.08
                    PANTHER                          249322    246987    0.08
                    PIR                              190563    155148    0.06
                    TIGR                             136495    130204    0.04
                    UniGene                          111140    106824    0.03
                    Ensembl                           99717     99715    0.03
                    ArrayExpress                      91421     91404    0.03
                    RZPD-ProtExp                      81191     32808    0.02
                    PIRSF                             80345     79566    0.02
                    Gramene                           71161     71161    0.02
                    MGI                               44511     43786    0.01
                    FlyBase                           25700     25663    0.01
                    TAIR                              19951     19890    0.01
                    WormPep                           19324     19239    0.01
                    WormBase                          19271     19188    0.01
                    LinkHub                           14660     14660   <0.01
                    MEROPS                            12421     11979   <0.01
                    ZFIN                              12302     12300   <0.01
                    LegioList                          5403      5373   <0.01
                    IntAct                             5209      5209   <0.01
                    ListiList                          4744      4727   <0.01
                    AGD                                4141      4141   <0.01
                    PDB                                4137      2465   <0.01
                    PhotoList                          4112      3988   <0.01
                    HGNC                               3152      3152   <0.01
                    TubercuList                        2551      2545   <0.01
                    DictyBase                          1967      1967   <0.01
                    RGD                                1902      1896   <0.01
                    GeneDB_Spombe                      1872      1859   <0.01
                    SagaList                           1762      1668   <0.01
                    Leproma                             974       973   <0.01
                    TRANSFAC                            897       886   <0.01
                    SGD                                 688       671   <0.01
                    PeroxiBase                          633       627   <0.01
                    MypuList                            593       589   <0.01
                    REBASE                              124       119   <0.01
                    PHCI-2DPAGE                         106       106   <0.01
                    ANU-2DPAGE                           64        64   <0.01
                    SWISS-2DPAGE                         48        48   <0.01
                    Reactome                              7         7   <0.01
                    PMMA-2DPAGE                           3         3   <0.01
                    Siena-2DPAGE                          2         2   <0.01
                    COMPLUYEAST-2DPAGE                    1         1   <0.01
                    
                    Number of explicitly cross-referenced databases: 78
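
The "Average per entry" column appears to divide each total by all entries in the release, not only by the entries that carry the line (this is an inference from the figures, not something the table states). A complementary figure, the number of lines per entry that has at least one such line, can be derived from the first two numeric columns. The sketch below shows one way to do that; `ROW`, `parse_row` and `lines_per_annotated_entry` are hypothetical names used only for this example.

```python
import re

# Subtype row: name, total lines, entries with >= 1 such line, average.
ROW = re.compile(r"^\s*(.+?)\s+(\d+)\s+(\d+)\s+(<?\d+\.\d+)\s*$")

def parse_row(line):
    """Return (name, total, entries), or None for rows without both counts."""
    match = ROW.match(line)
    if match is None:
        return None
    name, total, entries, _average = match.groups()
    return name, int(total), int(entries)

def lines_per_annotated_entry(total, entries):
    """Average number of lines among entries that have at least one."""
    return total / entries

# Two rows copied from the cross-references table above.
for row in [
    "GO                              6905142   1966760    2.08",
    "RZPD-ProtExp                      81191     32808    0.02",
]:
    name, total, entries = parse_row(row)
    print(name, round(lines_per_annotated_entry(total, entries), 2))
```

For GO this gives about 3.51 cross-references per GO-annotated entry, compared with 2.08 in the "Average per entry" column. Group header rows such as "References (RL)" list no entry count, so `parse_row` returns None for them.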
                    
                    
                    6.  MISCELLANEOUS STATISTICS
                    
                    Total number of distinct authors cited in UniProtKB/TrEMBL: 234955
                    
                    Total number of entries encoded on a Mitochondrion: 144724
                    Total number of entries encoded on a Plasmid: 55874
                    Total number of entries encoded on a Plastid: 3169
                    Total number of entries encoded on a Plastid; Apicoplast: 179
                    Total number of entries encoded on a Plastid; Chloroplast: 51775
                    Total number of entries encoded on a Plastid; Cyanelle: 7
                    Total number of entries encoded on a Plastid; Non-photosynthetic plastid: 166
                    
                    Number of fragments: 848216
                    
                

Submissions and Updates

We welcome feedback from our users. We would especially appreciate your notifying us if you find that sequences belonging to your field of expertise are missing from the database. We would also like to be notified about annotations that need updating, for example if the function of a protein has been clarified or if new information about post-translational modifications has become available.

Submit new sequence data, updates and corrections at http://www.uniprot.org/support/submissions.shtml

For all queries regarding submissions to UniProtKB and to submit new protein sequence data, please contact:

UniProt Knowledgebase
The EMBL Outstation - The European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom

Telephone: (+44 1223) 494 462
Telefax: (+44 1223) 494 468
E-mail: datasubs@ebi.ac.uk


Download information

Bi-Weekly releases

The latest UniProt Knowledgebase data is available in various formats (flatfile, XML or FASTA) at http://www.uniprot.org/database/download.shtml. The data is further supplemented by a file containing the sequences of all additional alternative isoforms annotated in UniProtKB/Swiss-Prot. This data set is documented in the file ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/README.varsplic

Major releases

For users who wish to download the UniProt Knowledgebase only occasionally, we distribute the latest major release (updated 3 times per year) in flatfile format. Previous UniProtKB/Swiss-Prot and UniProtKB/TrEMBL releases are archived under ftp://ftp.uniprot.org/pub/databases/uniprot/previous_major_releases. The UniProt Knowledgebase major release is also available on CD-ROM from the EBI.


Contact

EMBL Outstation
European Bioinformatics Institute (EBI)
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom

Telephone: (+44 1223) 494 444
Fax: (+44 1223) 494 468
Electronic mail address: datalib@ebi.ac.uk / swissprot@ebi.ac.uk
WWW server: http://www.ebi.ac.uk/


SIB Swiss Institute of Bioinformatics
Centre Medical Universitaire
1, rue Michel Servet
1211 Geneva 4
Switzerland

Telephone: (+41 22) 379 50 50
Fax: (+41 22) 379 58 58
Electronic mail address: Swiss-Prot@expasy.org
WWW server: http://www.expasy.org/


Protein Information Resource (PIR)
Georgetown University Medical Center
3300 Whitehaven St., Suite 1200
Washington, DC 20008
United States of America

Telephone: (+1 202) 687 1039
Fax: (+1 202) 687 0057
Electronic mail address: pirmail@georgetown.edu
WWW server: http://pir.georgetown.edu

Citation

If you want to cite UniProt in a publication please use the following reference:

Wu C.H., Apweiler R., Bairoch A., Natale D.A., Barker W.C., Boeckmann B., Ferro S., Gasteiger E., Huang H., Lopez R., Magrane M., Martin M.J., Mazumder R., O'Donovan C., Redaschi N., Suzek B. The Universal Protein Resource (UniProt): an expanding universe of protein information. Nucleic Acids Res. 34: D187-D191 (2006).