Release 7.0 of the UniProt Knowledgebase is composed of the UniProtKB/Swiss-Prot Protein Knowledgebase release 49.0 and the UniProtKB/TrEMBL Protein Database release 32.0.

More information on these databases can be found in the user manual, under "What is the UniProt Knowledgebase?".


UniProtKB/Swiss-Prot protein knowledgebase release 49.0 statistics

Release 49.0 of 07-Feb-2006 of Swiss-Prot contains 207'132 sequence entries, comprising 75'438'310 amino acids abstracted from 139'151 references.

The growth of the database is summarized below.

Release Date Number of entries Number of amino acids
2.0 09/86 3'939 900'163
3.0 11/86 4'160 969'641
4.0 04/87 4'387 1'036'010
5.0 09/87 5'205 1'327'683
6.0 01/88 6'102 1'653'982
7.0 04/88 6'821 1'885'771
8.0 08/88 7'724 2'224'465
9.0 11/88 8'702 2'498'140
10.0 03/89 10'008 2'952'613
11.0 07/89 10'856 3'265'966
12.0 10/89 12'305 3'797'482
13.0 01/90 13'837 4'347'336
14.0 04/90 15'409 4'914'264
15.0 08/90 16'941 5'486'399
16.0 11/90 18'364 5'986'949
17.0 02/91 20'024 6'524'504
18.0 05/91 20'772 6'792'034
19.0 08/91 21'795 7'173'785
20.0 11/91 22'654 7'500'130
21.0 03/92 23'742 7'866'596
22.0 05/92 25'044 8'375'696
23.0 08/92 26'706 9'011'391
24.0 12/92 28'154 9'545'427
25.0 04/93 29'955 10'214'020
26.0 07/93 31'808 10'875'091
27.0 10/93 33'329 11'484'420
28.0 02/94 36'000 12'496'420
29.0 06/94 38'303 13'464'008
30.0 10/94 40'292 14'147'368
31.0 02/95 43'470 15'335'248
32.0 11/95 49'340 17'385'503
33.0 02/96 52'205 18'531'384
34.0 10/96 59'021 21'210'389
35.0 11/97 69'113 25'083'768
36.0 07/98 74'019 26'840'295
37.0 12/98 77'977 28'268'293
38.0 07/99 80'000 29'085'965
39.0 05/00 86'593 31'411'114
40.0 10/01 101'602 37'315'215
41.0 02/03 122'564 44'986'459
42.0 10/03 135'850 50'046'799
43.0 03/04 146'720 54'093'154
44.0 07/04 153'871 56'608'159
45.0 10/04 163'235 59'631'787
46.0 02/05 168'297 61'443'278
47.0 05/05 181'577 65'746'672
48.0 09/05 194'317 70'391'852
49.0 02/06 207'132 75'438'310

In rare cases, Swiss-Prot entries are removed. Deleted entries are almost exclusively open reading frames (ORFs) that were wrongly predicted to code for proteins. When there is enough evidence that these hypothetical proteins do not exist, we remove them from Swiss-Prot. The document delac_sp.txt lists all accession numbers that were previously present in UniProtKB/Swiss-Prot but have since been deleted from the database.
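
As a rough illustration of how the deleted-accession list might be used, the sketch below checks whether an accession number appears in text laid out like delac_sp.txt. It assumes one accession number per line and six-character accession numbers of the form [OPQ][0-9][A-Z0-9]{3}[0-9]. The function name and the sample accession numbers are hypothetical placeholders, not taken from the actual file.

```python
import re

# Assumed accession format: O, P or Q, a digit, three alphanumerics, a digit.
ACCESSION_RE = re.compile(r"^[OPQ][0-9][A-Z0-9]{3}[0-9]$")


def parse_deleted_accessions(text):
    """Collect accession-like tokens that stand alone on a line."""
    deleted = set()
    for line in text.splitlines():
        token = line.strip()
        if ACCESSION_RE.match(token):
            deleted.add(token)
    return deleted


# Hypothetical excerpt: these accession numbers are placeholders for illustration.
sample = """\
Deleted accession numbers
_________________________
Q00001
P00002
"""

deleted = parse_deleted_accessions(sample)
print("Q00001" in deleted)  # listed in the sample
print("P83570" in deleted)  # not listed in the sample
```

With the real file, the same function could be applied to its contents after reading it with `open(...).read()`, provided the layout matches the assumption above.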


Status of the model organisms

We have selected a number of organisms that are the target of genome sequencing and/or mapping projects and for which we intend to:

  • be as complete as possible. All sequences available at a given time should be immediately included in UniProtKB/Swiss-Prot. This also includes sequence corrections and updates;
  • provide a higher level of annotation;
  • provide cross-references to specialized database(s) that contain, among other data, some information about the genes that code for these proteins;
  • provide specific indexes and documents.

From our efforts to annotate human sequence entries as completely as possible arose the HPI project, and the bacterial model organisms became the focus of the HAMAP project. Here is the current status of the model organisms which are not covered by these two projects:

Organism        Database cross-references  Index file    Number of sequences
A.thaliana      TAIR                       arath.txt                   3'957
C.albicans      None yet                   calbican.txt                  479
C.elegans       Wormpep                    celegans.txt                2'784
D.discoideum    DictyBase                  dicty.txt                     325
D.melanogaster  FlyBase                    fly.txt                     2'338
M.musculus      MGD                        mgdtosp.txt                10'523
S.cerevisiae    SGD                        yeast.txt                   5'271
S.pombe         GeneDB_SPombe              pombe.txt                   2'945

UniProtKB/Swiss-Prot release statistics
                    
                    
                    1.  INTRODUCTION
                    
                    Release 49.0 of 07-Feb-2006 of UniProtKB/Swiss-Prot contains 207'132 sequence entries,
                    comprising 75'438'310 amino acids abstracted from 139'151 references.
                    
                    12'815 sequences have been added since release 48, the sequence data of 991
                    existing entries has been updated, and the annotations of all entries have
                    been revised. The added sequences represent an increase of about 7% in the
                    number of entries.
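
The 7% figure can be reproduced from the growth table, which gives 194'317 entries for release 48.0 and 207'132 for release 49.0. A minimal Python sketch of the arithmetic (variable names are illustrative):

```python
# Entry counts taken from the growth table for releases 48.0 and 49.0.
entries_release_48 = 194_317
entries_release_49 = 207_132

added = entries_release_49 - entries_release_48
increase_pct = 100 * added / entries_release_48

print(added)                   # net number of new entries
print(round(increase_pct, 1))  # percentage growth relative to release 48.0
```

The net difference between the two releases equals the 12'815 added sequences, and the growth of roughly 6.6% rounds to the quoted 7%.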
                    
                    2.  AMINO ACID COMPOSITION
                    
                    2.1  Composition in percent for the complete database
                    
                    Ala (A) 7.83   Gln (Q) 3.95   Leu (L) 9.64   Ser (S) 6.86
                    Arg (R) 5.35   Glu (E) 6.64   Lys (K) 5.93   Thr (T) 5.42
                    Asn (N) 4.18   Gly (G) 6.93   Met (M) 2.38   Trp (W) 1.15
                    Asp (D) 5.32   His (H) 2.29   Phe (F) 4.00   Tyr (Y) 3.06
                    Cys (C) 1.52   Ile (I) 5.91   Pro (P) 4.83   Val (V) 6.71
                    
                    Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.00
                    
                    
                    2.2  Classification of the amino acids by their frequency
                    
                    Leu, Ala, Gly, Ser, Val, Glu, Lys, Ile, Thr, Arg, Asp, Pro, Asn, Phe,
                    Gln, Tyr, Met, His, Cys, Trp
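
The ordering in 2.2 follows directly from the percentages in 2.1. As a small sketch, sorting the composition table by descending frequency in Python reproduces the list (the dictionary simply restates the values above):

```python
# Percentages from section 2.1, keyed by three-letter amino acid code.
composition = {
    "Ala": 7.83, "Gln": 3.95, "Leu": 9.64, "Ser": 6.86,
    "Arg": 5.35, "Glu": 6.64, "Lys": 5.93, "Thr": 5.42,
    "Asn": 4.18, "Gly": 6.93, "Met": 2.38, "Trp": 1.15,
    "Asp": 5.32, "His": 2.29, "Phe": 4.00, "Tyr": 3.06,
    "Cys": 1.52, "Ile": 5.91, "Pro": 4.83, "Val": 6.71,
}

# Rank amino acids from most to least frequent.
ranked = sorted(composition, key=composition.get, reverse=True)
print(", ".join(ranked))
```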
                    
                    
                    3.  TAXONOMIC ORIGIN
                    
                    Total number of species represented in this release of UniProtKB/Swiss-Prot: 9731
                    
                    The first twenty species represent 69270 sequences:  33.4 % of the total
                    number of entries.
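
The share quoted above can be checked against the counts of the twenty most represented species listed in table 3.2. A minimal sketch, assuming the release total of 207'132 entries:

```python
# Entry counts of the twenty most represented species (table 3.2, ranks 1-20).
top_twenty = [
    13433, 10523, 5271, 4865, 4849, 3957, 2945, 2824, 2784, 2338,
    1796, 1789, 1782, 1772, 1549, 1476, 1444, 1405, 1323, 1145,
]
total_entries = 207_132

top_twenty_sum = sum(top_twenty)
share_pct = 100 * top_twenty_sum / total_entries

print(top_twenty_sum)
print(round(share_pct, 1))
```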
                    
                    
                    3.1 Table of the frequency of occurrence of species
                    
                    Species represented      1x: 4676
                                             2x: 1525
                                             3x:  748
                                             4x:  488
                                             5x:  319
                                             6x:  287
                                             7x:  197
                                             8x:  159
                                             9x:  140
                                            10x:   78
                                        11- 20x:  404
                                        21- 50x:  308
                                        51-100x:  110
                                          >100x:  292
                    
                    
                    3.2  Table of the most represented species
                    
                    ------  ---------  --------------------------------------------
                    Number  Frequency  Species
                    ------  ---------  --------------------------------------------
                    1      13433  Homo sapiens (Human)
                    2      10523  Mus musculus (Mouse)
                    3       5271  Saccharomyces cerevisiae (Baker's yeast)
                    4       4865  Rattus norvegicus (Rat)
                    5       4849  Escherichia coli
                    6       3957  Arabidopsis thaliana (Mouse-ear cress)
                    7       2945  Schizosaccharomyces pombe (Fission yeast)
                    8       2824  Bacillus subtilis
                    9       2784  Caenorhabditis elegans
                    10       2338  Drosophila melanogaster (Fruit fly)
                    11       1796  Escherichia coli O157:H7
                    12       1789  Bos taurus (Bovine)
                    13       1782  Methanococcus jannaschii
                    14       1772  Haemophilus influenzae
                    15       1549  Salmonella typhimurium
                    16       1476  Escherichia coli O6
                    17       1444  Shigella flexneri
                    18       1405  Mycobacterium tuberculosis
                    19       1323  Gallus gallus (Chicken)
                    20       1145  Mycobacterium bovis
                    21       1141  Salmonella typhi
                    22       1121  Xenopus laevis (African clawed frog)
                    23       1057  Pseudomonas aeruginosa
                    24       1022  Sus scrofa (Pig)
                    25        967  Archaeoglobus fulgidus
                    26        966  Synechocystis sp. (strain PCC 6803)
                    27        929  Pongo pygmaeus (Orangutan)
                    28        846  Vibrio cholerae
                    29        844  Yersinia pestis
                    30        836  Rhizobium meliloti (Sinorhizobium meliloti)
                    31        784  Oryctolagus cuniculus (Rabbit)
                    32        748  Aquifex aeolicus
                    33        724  Oryza sativa (Rice)
                    34        711  Pasteurella multocida
                    35        687  Vibrio parahaemolyticus
                    36        687  Mycoplasma pneumoniae
                    37        657  Staphylococcus aureus (strain Mu50 / ATCC 700699)
                    38        654  Staphylococcus aureus (strain N315)
                    39        650  Streptomyces coelicolor
                    40        643  Bacillus halodurans
                    41        639  Staphylococcus aureus (strain MW2)
                    42        636  Staphylococcus aureus (strain COL)
                    43        634  Staphylococcus aureus (strain MSSA476)
                    44        633  Staphylococcus aureus (strain MRSA252)
                    45        633  Vibrio vulnificus
                    46        627  Canis familiaris (Dog)
                    47        624  Mycobacterium leprae
                    48        619  Brachydanio rerio (Zebrafish) (Danio rerio)
                    49        613  Vibrio vulnificus (strain YJ016)
                    50        608  Treponema pallidum
                    51        596  Anabaena sp. (strain PCC 7120)
                    52        585  Methanobacterium thermoautotrophicum
                    53        572  Buchnera aphidicola subsp. Acyrthosiphon pisum
                    54        565  Pseudomonas putida (strain KT2440)
                    55        565  Helicobacter pylori (Campylobacter pylori)
                    56        562  Buchnera aphidicola subsp. Schizaphis graminum
                    57        560  Pseudomonas syringae pv. tomato
                    58        550  Bacillus anthracis
                    59        548  Staphylococcus epidermidis (strain ATCC 35984 / RP62A)
                    60        547  Rickettsia prowazekii
                    61        547  Staphylococcus epidermidis (strain ATCC 12228)
                    62        546  Helicobacter pylori J99 (Campylobacter pylori J99)
                    63        542  Bradyrhizobium japonicum
                    64        536  Lactococcus lactis subsp. lactis (Streptococcus lactis)
                    65        529  Ralstonia solanacearum (Pseudomonas solanacearum)
                    66        526  Zea mays (Maize)
                    67        526  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
                    68        525  Listeria monocytogenes
                    69        525  Photorhabdus luminescens subsp. laumondii
                    70        519  Listeria innocua
                    71        513  Rhizobium loti (Mesorhizobium loti)
                    72        508  Xanthomonas campestris pv. campestris
                    73        507  Buchnera aphidicola subsp. Baizongia pistaciae
                    74        505  Neisseria meningitidis serogroup B
                    75        502  Neisseria meningitidis serogroup A
                    76        495  Clostridium acetobutylicum
                    77        493  Shewanella oneidensis
                    78        492  Pan troglodytes (Chimpanzee)
                    79        490  Neurospora crassa
                    80        486  Mycoplasma genitalium
                    81        486  Caulobacter crescentus
                    82        479  Candida albicans (Yeast)
                    83        477  Bacillus cereus (strain ATCC 14579 / DSM 31)
                    84        473  Macaca fascicularis (Crab eating macaque) (Cynomolgus monkey)
                    85        470  Thermotoga maritima
                    86        470  Xanthomonas axonopodis pv. citri
                    87        464  Streptococcus pneumoniae
                    88        458  Xylella fastidiosa
                    89        455  Yersinia pseudotuberculosis
                    90        455  Listeria monocytogenes serotype 4b (strain F2365)
                    91        449  Xylella fastidiosa (strain Temecula1 / ATCC 700964)
                    92        446  Deinococcus radiodurans
                    93        440  Mimivirus
                    94        440  Pyrococcus horikoshii
                    95        440  Haemophilus ducreyi
                    96        436  Brucella melitensis
                    97        436  Methanosarcina acetivorans
                    98        435  Oceanobacillus iheyensis
                    99        435  Pyrococcus abyssi
                    100        435  Brucella suis
                    101        433  Corynebacterium glutamicum (Brevibacterium flavum)
                    102        433  Clostridium perfringens
                    103        432  Chlamydia trachomatis
                    104        432  Halobacterium salinarium (Halobacterium halobium)
                    105        427  Kluyveromyces lactis (Yeast)
                    106        419  Borrelia burgdorferi (Lyme disease spirochete)
                    107        416  Methanosarcina mazei (Methanosarcina frisia)
                    108        415  Ashbya gossypii (Yeast) (Eremothecium gossypii)
                    109        413  Chlamydia pneumoniae (Chlamydophila pneumoniae)
                    110        411  Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
                    111        408  Nicotiana tabacum (Common tobacco)
                    112        408  Pyrococcus furiosus
                    113        404  Rhizobium sp. (strain NGR234)
                    114        403  Chlamydia muridarum
                    115        400  Thermoanaerobacter tengcongensis
                    116        396  Lactobacillus plantarum
                    117        391  Campylobacter jejuni
                    118        389  Ovis aries (Sheep)
                    119        389  Bordetella bronchiseptica (Alcaligenes bronchisepticus)
                    120        387  Sulfolobus solfataricus
                    121        386  Streptococcus mutans
                    122        384  Synechococcus elongatus (Thermosynechococcus elongatus)
                    123        384  Erwinia carotovora subsp. atroseptica (Pectobacterium atrosepticum)
                    124        379  Chromobacterium violaceum
                    125        379  Streptococcus pyogenes serotype M1
                    126        377  Streptococcus pyogenes serotype M6
                    127        377  Bordetella pertussis
                    128        376  Bordetella parapertussis
                    129        376  Enterococcus faecalis (Streptococcus faecalis)
                    130        374  Streptococcus pyogenes serotype M18
                    131        373  Rickettsia conorii
                    132        373  Streptococcus pyogenes serotype M3
                    133        370  Candida glabrata (Yeast) (Torulopsis glabrata)
                    134        369  Staphylococcus aureus
                    135        368  Streptomyces avermitilis
                    136        353  Pyrococcus kodakaraensis (Thermococcus kodakaraensis)
                    137        352  Chlorobium tepidum
                    138        342  Aeropyrum pernix
                    139        341  Corynebacterium efficiens
                    140        340  Bacillus cereus (strain ATCC 10987)
                    141        338  Methanopyrus kandleri
                    142        337  Photobacterium profundum (Photobacterium sp. (strain SS9))
                    143        336  Leptospira interrogans
                    144        328  Nitrosomonas europaea
                    145        326  Leptospira interrogans serogroup Icterohaemorrhagiae serovar copenhageni
                    146        325  Dictyostelium discoideum (Slime mold)
                    147        323  Salmonella paratyphi-a
                    148        319  Emericella nidulans (Aspergillus nidulans)
                    149        317  Sulfolobus tokodaii
                    150        316  Pisum sativum (Garden pea)
                    151        313  Streptococcus agalactiae serotype III
                    152        310  Streptococcus agalactiae serotype V
                    153        305  Gloeobacter violaceus
                    154        303  Thermoplasma acidophilum
                    155        302  Lycopersicon esculentum (Tomato)
                    156        295  Yarrowia lipolytica (Candida lipolytica)
                    157        293  Triticum aestivum (Wheat)
                    158        292  Synechococcus sp. (strain WH8102)
                    159        289  Fusobacterium nucleatum subsp. nucleatum
                    160        287  Prochlorococcus marinus (strain MIT 9313)
                    161        287  Rhodopseudomonas palustris
                    162        284  Prochlorococcus marinus
                    163        283  Bacillus thuringiensis subsp. konkukian
                    164        281  Macaca mulatta (Rhesus macaque)
                    165        281  Acinetobacter sp. (strain ADP1)
                    166        280  Pseudomonas putida
                    167        278  Sulfolobus acidocaldarius
                    168        276  Hordeum vulgare (Barley)
                    169        274  Coxiella burnetii
                    170        271  Cavia porcellus (Guinea pig)
                    171        269  Pyrobaculum aerophilum
                    172        269  Glycine max (Soybean)
                    173        268  Bacteriophage T4
                    174        268  Prochlorococcus marinus subsp. pastoris (strain CCMP 1378 / MED4)
                    175        267  Thermoplasma volcanium
                    176        265  Clostridium tetani
                    177        261  Solanum tuberosum (Potato)
                    178        259  Bacteroides thetaiotaomicron
                    179        258  Debaryomyces hansenii (Yeast) (Torulaspora hansenii)
                    180        258  Rhodopirellula baltica
                    181        257  Mycobacterium paratuberculosis
                    182        254  Rhodobacter capsulatus (Rhodopseudomonas capsulata)
                    183        254  Vaccinia virus (strain Copenhagen) (VACV)
                    184        254  Wolinella succinogenes
                    185        249  Bacillus clausii (strain KSM-K16)
                    186        248  Ureaplasma parvum (Ureaplasma urealyticum biotype 1)
                    187        246  Spinacia oleracea (Spinach)
                    188        244  Burkholderia pseudomallei (Pseudomonas pseudomallei)
                    189        244  Bacillus cereus (strain ZK / E33L)
                    190        243  Mannheimia succiniciproducens (strain MBEL55E)
                    191        242  Wigglesworthia glossinidia brevipalpis
                    192        240  Thermus thermophilus (strain HB8 / ATCC 27634 / DSM 579)
                    193        238  Geobacter sulfurreducens
                    194        237  Bifidobacterium longum
                    195        235  Bacillus stearothermophilus
                    196        234  Corynebacterium diphtheriae
                    197        232  Equus caballus (Horse)
                    198        231  Chlamydophila caviae
                    199        229  Porphyromonas gingivalis (Bacteroides gingivalis)
                    200        225  Desulfovibrio vulgaris (strain Hildenborough / ATCC 29579 / NCIMB 8303)
                    201        224  Burkholderia mallei (Pseudomonas mallei)
                    202        224  Helicobacter hepaticus
                    203        224  Methanococcus maripaludis
                    204        221  Methylococcus capsulatus
                    205        220  Porphyra purpurea
                    206        219  Thermus thermophilus (strain HB27 / ATCC BAA-163 / DSM 7039)
                    207        217  Haloarcula marismortui (Halobacterium marismortui)
                    208        216  Chlamydomonas reinhardtii
                    209        212  Zymomonas mobilis
                    210        212  Synechococcus sp. (strain PCC 6301) (Anacystis nidulans)
                    211        209  Klebsiella pneumoniae
                    212        209  Leifsonia xyli subsp. xyli
                    213        205  Blochmannia floridanus
                    214        204  Geobacillus kaustophilus
                    215        203  Nocardia farcinica
                    216        200  Vaccinia virus (strain Western Reserve / WR) (VACV)
                    
                    
                    3.3  Taxonomic distribution of the sequences
                    
                    
                    Kingdom        sequences (% of the database)
                    Archaea           10124 (  5%)
                    Bacteria          96390 ( 47%)
                    Eukaryota         90758 ( 44%)
                    Viruses            9860 (  5%)
                    
                    
                    Within Eukaryota:
                    
                    
                    Category            sequences (% of Eukaryota) (% of the complete database)
                    Human                  13434 ( 15%)           (  6%)
                    Other Mammalia         27101 ( 30%)           ( 13%)
                    Other Vertebrata        8051 (  9%)           (  4%)
                    Viridiplantae          14694 ( 16%)           (  7%)
                    Fungi                  13810 ( 15%)           (  7%)
                    Insecta                 4492 (  5%)           (  2%)
                    Nematoda                3133 (  3%)           (  2%)
                    Other                   6043 (  7%)           (  3%)
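
Both percentage columns in the table above are each category's count divided by the Eukaryota total or by the complete database, rounded to the nearest whole percent. A short sketch of that calculation (the variable names are illustrative):

```python
total_entries = 207_132

# Sequence counts per category, as listed in the "Within Eukaryota" table.
eukaryota = {
    "Human": 13_434,
    "Other Mammalia": 27_101,
    "Other Vertebrata": 8_051,
    "Viridiplantae": 14_694,
    "Fungi": 13_810,
    "Insecta": 4_492,
    "Nematoda": 3_133,
    "Other": 6_043,
}
eukaryota_total = sum(eukaryota.values())

# Each row: (category, count, % of Eukaryota, % of the complete database).
rows = [
    (name, count,
     round(100 * count / eukaryota_total),
     round(100 * count / total_entries))
    for name, count in eukaryota.items()
]

for name, count, pct_euk, pct_db in rows:
    print(f"{name:<18}{count:>6} ({pct_euk:>3}%) ({pct_db:>3}%)")
```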
                    
                    
                    4.  SEQUENCE SIZE
                    
                    Distribution of the sequences by size (excluding fragments)
                    
                    From   To   Number             From   To   Number
                       1-  50     4139             1001-1100     1796
                      51- 100    14793             1101-1200     1183
                     101- 150    21093             1201-1300      898
                     151- 200    20181             1301-1400      675
                     201- 250    20675             1401-1500      557
                     251- 300    17760             1501-1600      335
                     301- 350    18308             1601-1700      241
                     351- 400    16671             1701-1800      198
                     401- 450    13067             1801-1900      196
                     451- 500    10877             1901-2000      162
                     501- 550     8282             2001-2100      104
                     551- 600     5613             2101-2200      153
                     601- 650     4693             2201-2300      132
                     651- 700     3352             2301-2400       89
                     701- 750     2810             2401-2500       70
                     751- 800     2326             >2500          521
                     801- 850     1932
                     851- 900     2052
                     901- 950     1565
                     951-1000     1200
                    
                    
                    The average sequence length in UniProtKB/Swiss-Prot is 364 amino acids.
                    
                    The shortest sequence is   GWA_SEPOF (P83570):     2 amino acids.
                    The longest sequence is  SYNE1_HUMAN (Q8NF91):  8797 amino acids.
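
The average length quoted above is consistent with the release totals: it is the total number of amino acids divided by the number of entries. A minimal sketch of that check:

```python
# Release 49.0 totals from the introduction.
total_amino_acids = 75_438_310
total_entries = 207_132

average_length = total_amino_acids / total_entries
print(round(average_length))  # average sequence length in amino acids
```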
                    
                    
                    5.  JOURNAL CITATIONS
                    
                    Note: the following citation statistics reflect the number of distinct
                    journal citations.
                    
                    Total number of journals cited in this release of UniProtKB/Swiss-Prot: 1662
                    
                    
                    5.1 Table of the frequency of journal citations
                    
                    Journals cited      1x:  586
                                        2x:  225
                                        3x:  128
                                        4x:   81
                                        5x:   56
                                        6x:   39
                                        7x:   31
                                        8x:   41
                                        9x:   18
                                       10x:   16
                                   11- 20x:  116
                                   21- 50x:  145
                                   51-100x:   59
                                     >100x:  121
                    
                    
                    5.2  List of the most cited journals in UniProtKB/Swiss-Prot
                    
                    Nb    Citations   Journal name
                    --    ---------   -------------------------------------------------------------
                    1        13103   Journal of Biological Chemistry
                    2         6437   Proceedings of the National Academy of Sciences of the U.S.A.
                    3         4277   Journal of Bacteriology
                    4         3988   Gene
                    5         3925   Nucleic Acids Research
                    6         3442   Biochemical and Biophysical Research Communications
                    7         3316   FEBS Letters
                    8         3026   Biochemistry
                    9         2866   The EMBO Journal
                    10         2842   European Journal of Biochemistry
                    11         2599   Nature
                    12         2504   Biochimica et Biophysica Acta
                    13         2294   Journal of Molecular Biology
                    14         2257   Molecular and Cellular Biology
                    15         2158   Genomics
                    16         2069   Cell
                    17         1667   Biochemical Journal
                    18         1560   Science
                    19         1379   Molecular Microbiology
                    20         1277   Plant Molecular Biology
                    21         1251   Molecular and General Genetics
                    22         1080   Journal of Cell Biology
                    23         1034   Journal of Biochemistry
                    24         1006   Virology
                    25          989   Human Molecular Genetics
                    26          956   Journal of Virology
                    27          946   Nature Genetics
                    28          899   Genes and Development
                    29          828   Plant Physiology
                    30          815   Oncogene
                    31          807   The American Journal of Human Genetics
                    32          747   Human Mutation
                    33          699   Journal of Immunology
                    34          693   Infection and Immunity
                    35          664   Structure
                    36          663   Development
                    37          652   Archives of Biochemistry and Biophysics
                    38          641   Yeast
                    39          616   Journal of General Virology
                    40          608   Genetics
                    41          573   Microbiology
                    42          530   FEMS Microbiology Letters
                    43          520   Nature Structural Biology
                    44          490   Blood
                    45          465   Human Genetics
                    46          462   The Plant Cell
                    47          456   Current Genetics
                    48          455   Molecular Biology of the Cell
                    49          408   Applied and Environmental Microbiology
                    50          404   Cancer Research
                    51          403   Developmental Biology
                    52          395   Journal of Clinical Investigation
                    53          393   Molecular and Biochemical Parasitology
                    54          391   Journal of Cell Science
                    55          381   Mammalian Genome
                    56          379   Protein Science
                    57          378   Mechanisms of Development
                    58          375   Neuron
                    59          375   The Plant Journal
                    60          367   Molecular Endocrinology
                    61          362   Acta Crystallographica, Section D
                    62          358   Molecular Cell
                    63          354   The Journal of Experimental Medicine
                    64          346   Immunogenetics
                    65          340   Journal of Neuroscience
                    66          331   Journal of Molecular Evolution
                    67          321   Endocrinology
                    68          320   DNA and Cell Biology
                    69          304   Current Biology
                    70          294   Journal of Neurochemistry
                    71          286   DNA Sequence
                    72          283   Biological Chemistry Hoppe-Seyler
                    73          270   American Journal of Physiology
                    74          267   Molecular Biology and Evolution
                    75          266   The Journal of Clinical Endocrinology and Metabolism
                    76          260   Bioscience, Biotechnology, and Biochemistry
                    77          258   Brain Research. Molecular Brain Research
                    78          243   Toxicon
                    79          241   Journal of General Microbiology
                    80          238   Cytogenetics and Cell Genetics
                    81          221   Comparative Biochemistry and Physiology
                    82          214   Hoppe-Seyler's Zeitschrift fur Physiologische Chemie
                    83          207   Antimicrobial Agents and Chemotherapy
                    84          201   Proteins
                    85          196   Molecular Pharmacology
                    86          186   Journal of Investigative Dermatology
                    87          186   Journal of Medical Genetics
                    88          170   DNA Research
                    89          170   Peptides
                    90          166   Plant and Cell Physiology
                    91          162   Molecular Plant-Microbe Interactions
                    92          162   Virus Research
                    93          161   Genome Research
                    94          159   Biology of Reproduction
                    95          158   DNA
                    96          152   Tissue Antigens
                    97          151   European Journal of Immunology
                    98          146   Biochimie
                    99          141   Molecular and Cellular Endocrinology
                    100          139   American Journal of Medical Genetics
                    101          138   Bioorganicheskaia Khimiia
                    102          135   Hemoglobin
                    103          128   Experimental Cell Research
                    104          127   Nature Cell Biology
                    105          126   Archives of Microbiology
                    106          124   Annals of Neurology
                    107          124   Molecular Phylogenetics and Evolution
                    108          121   Neurology
                    109          120   Insect Biochemistry and Molecular Biology
                    110          118   Agricultural and Biological Chemistry
                    111          117   European Journal of Human Genetics
                    112          113   Journal of Human Genetics
                    113          113   Immunity
                    114          113   RNA
                    115          112   General and Comparative Endocrinology
                    116          111   Developmental Dynamics
                    117          106   Diabetes
                    118          103   Molecular Reproduction and Development
                    119          103   Molecular Immunology
                    120          103   Planta
                    121          102   Genes to Cells
                    122          100   Journal of Protein Chemistry
                    
                    
                    6.  STATISTICS FOR SOME LINE TYPES
                    
                    The following table summarizes the total number of some UniProtKB/Swiss-Prot lines,
                    as well as the number of entries with at least one such line, and the
                    frequency of the lines.
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry
                    ---------------------------------  -------- ---------  ---------
                    
                    References (RL)                     407878              1.97
                    Journal                          361072    192976    1.74
                    Submitted to EMBL/GenBank/DDBJ    43559     37255    0.21
                    Submitted to Swiss-Prot             726       723   <0.01
                    Unpublished observations            567       563   <0.01
                    Book citation                       547       535   <0.01
                    Plant Gene Register                 519       507   <0.01
                    Submitted to other databases        388       380   <0.01
                    Thesis                              341       339   <0.01
                    Patent                              131       129   <0.01
                    Unpublished results                  22        22   <0.01
                    Worm Breeder's Gazette                6         6   <0.01
                    
                    Comments (CC)                       801430              3.87
                    SIMILARITY                       225736    186967    1.09
                    FUNCTION                         140862    137032    0.68
                    SUBCELLULAR LOCATION             106708    106708    0.52
                    CATALYTIC ACTIVITY                75237     69754    0.36
                    SUBUNIT                           71489     71489    0.35
                    PATHWAY                           39782     34536    0.19
                    COFACTOR                          30534     27405    0.15
                    TISSUE SPECIFICITY                21716     21716    0.10
                    MISCELLANEOUS                     17969     16342    0.09
                    PTM                               15424     13217    0.07
                    DOMAIN                            10600      9263    0.05
                    ALTERNATIVE PRODUCTS               8509      8509    0.04
                    CAUTION                            7869      6989    0.04
                    INDUCTION                          5801      5801    0.03
                    DEVELOPMENTAL STAGE                5218      5218    0.03
                    INTERACTION                        4956      4956    0.02
                    DISEASE                            3190      2329    0.02
                    ENZYME REGULATION                  3089      3089    0.01
                    MASS SPECTROMETRY                  2070      1747    0.01
                    DATABASE                           1562      1406    0.01
                    BIOPHYSICOCHEMICAL PROPERTIES      1303      1303    0.01
                    POLYMORPHISM                        531       519   <0.01
                    ALLERGEN                            406       406   <0.01
                    RNA EDITING                         403       403   <0.01
                    TOXIC DOSE                          280       278   <0.01
                    BIOTECHNOLOGY                       125       125   <0.01
                    PHARMACEUTICAL                       61        61   <0.01
                    
                    Features (FT)                      1502318              7.25
                    CHAIN                            210458    203970    1.02
                    STRAND                           147090      7249    0.71
                    TRANSMEM                         135865     29419    0.66
                    TURN                              95995      7364    0.46
                    METAL                             88482     21513    0.43
                    CONFLICT                          75912     26372    0.37
                    TOPO_DOM                          70026     14125    0.34
                    HELIX                             67093      7146    0.32
                    CARBOHYD                          64943     16393    0.31
                    DISULFID                          64555     16785    0.31
                    DOMAIN                            62427     33891    0.30
                    ACT_SITE                          48527     28416    0.23
                    REPEAT                            45273      6546    0.22
                    VARIANT                           37244      7394    0.18
                    BINDING                           30308     14568    0.15
                    MOD_RES                           29797     14761    0.14
                    NP_BIND                           28495     20106    0.14
                    REGION                            27780     14613    0.13
                    SIGNAL                            20949     20947    0.10
                    COMPBIAS                          20505     11351    0.10
                    VARSPLIC                          17662      7652    0.09
                    MUTAGEN                           14751      3693    0.07
                    ZN_FING                           14118      5500    0.07
                    MOTIF                             12527      8696    0.06
                    SITE                              10950      6128    0.05
                    NON_TER                           10844      8278    0.05
                    INIT_MET                           9461      9384    0.05
                    PROPEP                             6855      5725    0.03
                    COILED                             6420      3939    0.03
                    DNA_BIND                           6094      5691    0.03
                    LIPID                              6044      3972    0.03
                    PEPTIDE                            5845      3574    0.03
                    TRANSIT                            3637      3603    0.02
                    CA_BIND                            2417       978    0.01
                    CROSSLNK                           1210       942    0.01
                    NON_CONS                           1120       523    0.01
                    UNSURE                              418       170   <0.01
                    SE_CYS                              221       155   <0.01
                    
                    Cross-references (DR)              2236664             10.80
                    InterPro                         423342    189133    2.04
                    EMBL                             395184    199200    1.91
                    Pfam                             248875    182310    1.20
                    PROSITE                          188497    116608    0.91
                    GO                                97279     27620    0.47
                    PIR                               94760     88562    0.46
                    PRINTS                            78657     61371    0.38
                    TIGRFAMs                          76795     71754    0.37
                    HSSP                              76069     76069    0.37
                    HAMAP                             71745     71631    0.35
                    BioCyc                            67849     62817    0.33
                    SMART                             58049     44253    0.28
                    ProDom                            54565     52510    0.26
                    PANTHER                           48143     45588    0.23
                    Ensembl                           38163     38153    0.18
                    PDB                               30838      8497    0.15
                    SMR                               26812     26812    0.13
                    TIGR                              20204     19648    0.10
                    PIRSF                             17045     16795    0.08
                    LinkHub                           14271     14271    0.07
                    HGNC                              12793     12737    0.06
                    MIM                               11422      9364    0.06
                    MGI                               10357     10318    0.05
                    IntAct                             6588      6588    0.03
                    SGD                                5328      5263    0.03
                    GermOnline                         4926      4880    0.02
                    RGD                                4605      4602    0.02
                    EcoGene                            4225      4223    0.02
                    EchoBASE                           4159      4127    0.02
                    TAIR                               3998      3926    0.02
                    MEROPS                             3958      3837    0.02
                    H-InvDB                            3676      3658    0.02
                    WormPep                            3260      2782    0.02
                    GeneDB_Spombe                      2978      2943    0.01
                    FlyBase                            2967      2920    0.01
                    WormBase                           2859      2781    0.01
                    TRANSFAC                           2811      2522    0.01
                    SubtiList                          2766      2765    0.01
                    Gramene                            2092      2084    0.01
                    StyGene                            1505      1502    0.01
                    TubercuList                        1433      1397    0.01
                    GeneFarm                           1305      1299    0.01
                    SWISS-2DPAGE                       1166      1166    0.01
                    ListiList                          1045      1037    0.01
                    Reactome                            998       998   <0.01
                    Leproma                             627       624   <0.01
                    ZFIN                                613       606   <0.01
                    PhotoList                           525       525   <0.01
                    MaizeDB                             432       427   <0.01
                    AGD                                 421       415   <0.01
                    HIV                                 370       365   <0.01
                    OGP                                 369       369   <0.01
                    REBASE                              352       348   <0.01
                    ECO2DBASE                           351       299   <0.01
                    LegioList                           334       334   <0.01
                    DictyBase                           326       324   <0.01
                    SagaList                            314       313   <0.01
                    GlycoSuiteDB                        282       282   <0.01
                    PHCI-2DPAGE                         239       239   <0.01
                    MypuList                            181       181   <0.01
                    Aarhus/Ghent-2DPAGE                 128        98   <0.01
                    Siena-2DPAGE                        103       103   <0.01
                    HSC-2DPAGE                           85        85   <0.01
                    PhosSite                             64        62   <0.01
                    COMPLUYEAST-2DPAGE                   59        59   <0.01
                    PMMA-2DPAGE                          52        52   <0.01
                    Rat-heart-2DPAGE                     28        28   <0.01
                    PptaseDB                             27        27   <0.01
                    ANU-2DPAGE                           20        20   <0.01
                    
                    Number of explicitly cross-referenced databases: 70
                    Number of implicitly cross-referenced databases: 29
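
The "Average per entry" column in the table above appears to be the line total divided by the number of entries in the release (207'132 for Swiss-Prot 49.0), rounded to two decimals, with very small values shown as "<0.01". The following is a minimal Python sketch under that assumption; the function name `average_per_entry` and the 0.005 cut-off are illustrative choices, not part of the release notes.

```python
# Assumption: "Average per entry" = total lines / total entries in the release.
SWISSPROT_ENTRIES = 207_132  # Swiss-Prot release 49.0 entry count


def average_per_entry(total_lines: int, entries: int = SWISSPROT_ENTRIES) -> str:
    """Format the average as the table does; values that would round to 0.00 become '<0.01'."""
    avg = total_lines / entries
    return "<0.01" if avg < 0.005 else f"{avg:.2f}"


# Section totals copied from the table above.
totals = {
    "References (RL)": 407_878,
    "Comments (CC)": 801_430,
    "Features (FT)": 1_502_318,
    "Cross-references (DR)": 2_236_664,
}

for line_type, total in totals.items():
    print(f"{line_type:<24}{average_per_entry(total):>6}")
```

Under this reading, the four section totals reproduce the published averages, which suggests the averages are taken over all entries rather than only over entries containing the line type.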
                    
                    
                    7.  MISCELLANEOUS STATISTICS
                    
                    Total number of distinct authors cited in UniProtKB/Swiss-Prot: 216069
                    
                    Total number of entries encoded on a Mitochondrion: 3397
                    Total number of entries encoded on a Plasmid: 3073
                    Total number of entries encoded on a Plastid: 20
                    Total number of entries encoded on a Plastid; Apicoplast: 2
                    Total number of entries encoded on a Plastid; Chloroplast: 5174
                    Total number of entries encoded on a Plastid; Cyanelle: 145
                    Total number of entries encoded on a Plastid; Non-photosynthetic plastid: 86
                    
                    Number of fragments: 8433
                    Number of additional sequences encoded on splice variants: 13333
                    
                    
                

UniProtKB/TrEMBL protein database release 32.0 statistics

                    
                    1.  INTRODUCTION
                    
                    Release 32.0 of 07-February-2006 of UniProtKB/TrEMBL has been produced in sync
                    with UniProtKB/Swiss-Prot release 49 and with EMBL/DDBJ/GenBank nucleotide
                    sequence database release 85, together with updates up to 30-January-2006.
                    It contains 2'605'574 sequence entries comprising 838'379'783 amino acids.
                    
                    
                    The document delac_tr.txt lists all accession numbers that were previously
                    present in UniProtKB/TrEMBL but have since been deleted from the database.
                    Most deletions follow the deletion of the corresponding CDS in the source
                    nucleotide sequence databases EMBL-Bank/DDBJ/GenBank. In addition, some
                    entries are recognised to be open reading frames (ORFs) that were wrongly
                    predicted to code for proteins. When there is enough evidence that these
                    hypothetical proteins are not real, we remove them from UniProtKB/TrEMBL.
                    
                    
                    2.  AMINO ACID COMPOSITION
                    
                    2.1  Composition in percent for the complete database
                    
                    Ala (A) 8.20   Gln (Q) 3.87   Leu (L) 9.82   Ser (S) 6.97
                    Arg (R) 5.50   Glu (E) 6.06   Lys (K) 5.32   Thr (T) 5.67
                    Asn (N) 4.32   Gly (G) 6.99   Met (M) 2.39   Trp (W) 1.34
                    Asp (D) 5.18   His (H) 2.26   Phe (F) 4.06   Tyr (Y) 3.05
                    Cys (C) 1.42   Ile (I) 5.93   Pro (P) 4.91   Val (V) 6.58
                    
                    Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.05
                    
                    
                    2.2  Classification of the amino acids by their frequency
                    
                    Leu, Ala, Gly, Ser, Val, Glu, Ile, Thr, Arg, Lys, Asp, Pro, Asn, Phe,
                    Gln, Tyr, Met, His, Cys, Trp
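
The ordering above can be reproduced by sorting the percentages from section 2.1 in descending order. A minimal Python sketch follows; the variable names are illustrative, and the values are copied from the composition table.

```python
# Composition values (percent) copied from section 2.1.
composition = {
    "Ala": 8.20, "Gln": 3.87, "Leu": 9.82, "Ser": 6.97,
    "Arg": 5.50, "Glu": 6.06, "Lys": 5.32, "Thr": 5.67,
    "Asn": 4.32, "Gly": 6.99, "Met": 2.39, "Trp": 1.34,
    "Asp": 5.18, "His": 2.26, "Phe": 4.06, "Tyr": 3.05,
    "Cys": 1.42, "Ile": 5.93, "Pro": 4.91, "Val": 6.58,
}

# Sort residues from most to least frequent.
ranking = sorted(composition, key=composition.get, reverse=True)
print(", ".join(ranking))
```

No two residues share the same percentage in this release, so the order does not depend on how ties are broken.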
                    
                    
                    3.  TAXONOMIC ORIGIN
                    
                    Total number of species represented in this release of
                    UniProtKB/TrEMBL: 103997
                    
                    The first twenty species represent 604592 sequences: 23.2 % of the
                    total number of entries.
                    
                    
                    3.1 Table of the frequency of occurrence of species
                    
                    Species represented 1x:49711
                                        2x:19812
                                        3x: 9920
                                        4x: 5494
                                        5x: 3115
                                        6x: 2428
                                        7x: 1675
                                        8x: 1380
                                        9x: 1119
                                       10x:  923
                                   11- 20x: 4231
                                   21- 50x: 2134
                                   51-100x:  860
                                     >100x: 1195
                    
                    
                    3.2  Table of the most represented species
                    
                    ------  ---------  --------------------------------------------
                    Number  Frequency  Species
                    ------  ---------  --------------------------------------------
                    1     146096  Human immunodeficiency virus 1
                    2      57551  Homo sapiens (Human)
                    3      57096  Oryza sativa (japonica cultivar-group)
                    4      51339  Mus musculus (Mouse)
                    5      42093  Arabidopsis thaliana (Mouse-ear cress)
                    6      28014  Tetraodon nigroviridis (Green puffer)
                    7      26030  Hepatitis C virus
                    8      25255  Drosophila melanogaster (Fruit fly)
                    9      20339  Caenorhabditis elegans
                    10      20120  Trypanosoma cruzi
                    11      15136  Anopheles gambiae str. PEST
                    12      14669  Plasmodium chabaudi
                    13      14617  Dictyostelium discoideum (Slime mold)
                    14      13851  Brachydanio rerio (Zebrafish) (Danio rerio)
                    15      13144  Caenorhabditis briggsae
                    16      12337  Xenopus laevis (African clawed frog)
                    17      12181  Aspergillus oryzae
                    18      11767  Plasmodium berghei
                    19      11748  Gibberella zeae (Fusarium graminearum)
                    20      11209  uncultured bacterium
                    21      10803  Neurospora crassa
                    22      10435  Hepatitis B virus (HBV)
                    23      10158  Aspergillus fumigatus (Sartorya fumigata)
                    24       9889  Rattus norvegicus (Rat)
                    25       9739  Trypanosoma brucei
                    26       9693  Schistosoma japonicum (Blood fluke)
                    27       9405  Aspergillus nidulans FGSC A4
                    28       9090  Entamoeba histolytica HM-1:IMSS
                    29       9050  Candida albicans SC5314
                    30       8102  Bradyrhizobium japonicum
                    31       8063  Solibacter usitatus Ellin6076
                    32       7937  Frankia sp. EAN1pec
                    33       7800  Plasmodium yoelii yoelii
                    34       7740  Escherichia coli
                    35       7715  Burkholderia sp. (strain 383) (Burkholderia cepacia 
                    36       7663  Burkholderia vietnamiensis G4
                    37       7559  Streptomyces coelicolor
                    38       7432  Bradyrhizobium sp. BTAi1
                    39       7341  Streptomyces avermitilis
                    40       7165  Rhizobium loti (Mesorhizobium loti)
                    41       7085  Leishmania major
                    42       7049  Burkholderia cenocepacia HI2424
                    43       7013  Rhodopirellula baltica
                    44       6979  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
                    45       6752  Hahella chejuensis KCTC 2396
                    46       6567  Pseudomonas aeruginosa
                    47       6562  Bos taurus (Bovine)
                    48       6526  Burkholderia ambifaria AMMD
                    49       6505  Cryptococcus neoformans (Filobasidiella neoformans)
                    50       6475  Cryptococcus neoformans var. neoformans B-3501A
                    51       6456  Burkholderia cenocepacia AU 1054
                    52       6451  Ustilago maydis 521
                    53       6408  Ralstonia eutropha (strain JMP134) (Alcaligenes eutrophus)
                    54       6394  Giardia lamblia ATCC 50803
                    55       6329  Burkholderia pseudomallei (strain 1710b)
                    56       6316  Ralstonia metallidurans (strain CH34)
                    57       6310  Yarrowia lipolytica (Candida lipolytica)
                    58       6228  Bacillus anthracis
                    59       6129  Bacillus thuringiensis serovar israelensis ATCC 35646
                    60       6084  Debaryomyces hansenii (Yeast) (Torulaspora hansenii)
                    61       6079  Pseudomonas fluorescens (strain Pf-5 / ATCC BAA-477)
                    62       5905  Bacillus cereus G9241
                    63       5737  Nocardia farcinica
                    64       5728  Pseudomonas fluorescens (strain PfO-1)
                    65       5707  Rhizobium meliloti (Sinorhizobium meliloti)
                    66       5686  Burkholderia pseudomallei (Pseudomonas pseudomallei)
                    67       5661  Crocosphaera watsonii
                    68       5646  Polaromonas sp. JS666
                    69       5638  Anabaena variabilis (strain ATCC 29413)
                    70       5593  Gallus gallus (Chicken)
                    71       5561  Burkholderia thailandensis E264
                    72       5550  Anabaena sp. (strain PCC 7120)
                    73       5494  Bacillus cereus (strain ATCC 10987)
                    74       5394  Bacillus cereus (strain ZK / E33L)
                    75       5312  Chimpanzee immunodeficiency virus (SIV-cpz) 
                    76       5288  Helicobacter pylori (Campylobacter pylori)
                    77       5245  Pseudomonas putida F1
                    78       5234  Plasmodium falciparum
                    79       5223  Plasmodium falciparum (isolate 3D7)
                    80       5153  Yersinia pestis
                    81       5084  Paracoccus denitrificans PD1222
                    82       5053  Clostridium beijerincki NCIMB 8052
                    83       5050  Streptococcus pneumoniae
                    84       5019  Pseudomonas syringae pv. syringae (strain B728a)
                    85       5018  Photobacterium profundum (Photobacterium sp. (strain SS9))
                    86       5009  Pseudomonas syringae pv. phaseolicola (strain 1448A / Race 6)
                    87       5005  Kluyveromyces lactis (Yeast)
                    88       4971  Bordetella bronchiseptica (Alcaligenes bronchisepticus)
                    89       4955  Pseudomonas syringae pv. tomato
                    90       4938  Azotobacter vinelandii AvOP
                    91       4935  Rhodopseudomonas palustris BisB18
                    92       4929  Candida glabrata (Yeast) (Torulopsis glabrata)
                    93       4911  Nocardioides sp. JS614
                    94       4896  Rhodopseudomonas palustris BisA53
                    95       4827  Colwellia psychrerythraea (strain 34H / ATCC BAA-681) (Vibrio psychroerythus)
                    96       4818  Escherichia coli O157:H7
                    97       4809  Bacillus thuringiensis subsp. konkukian
                    98       4769  Bacillus cereus (strain ATCC 14579 / DSM 31)
                    99       4751  Bacillus licheniformis (strain DSM 13 / ATCC 14580)
                    100       4748  Pseudomonas putida (strain KT2440)
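
The figure quoted in section 3 for the first twenty species (604592 sequences, 23.2 % of the entries) can be checked against the first twenty rows of this table. A minimal Python sketch, assuming the release total of 2'605'574 entries given in the introduction; the variable names are illustrative.

```python
# Total number of UniProtKB/TrEMBL entries in release 32.0 (from the introduction).
TOTAL_ENTRIES = 2_605_574

# Frequencies of the twenty most represented species, copied from the table above.
top20 = [
    146096, 57551, 57096, 51339, 42093,
    28014, 26030, 25255, 20339, 20120,
    15136, 14669, 14617, 13851, 13144,
    12337, 12181, 11767, 11748, 11209,
]

top20_total = sum(top20)
share = 100 * top20_total / TOTAL_ENTRIES
print(f"{top20_total} sequences: {share:.1f} % of the total number of entries")
```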
                    
                    
                    3.3  Taxonomic distribution of the sequences
                    
                    Kingdom        sequences (% of the database)
                    Archaea           63173 (  2%)
                    Bacteria        1193711 ( 46%)
                    Eukaryota        984105 ( 38%)
                    Viruses          362130 ( 14%)
                    Other              2455 ( <1%)
                    
                    Within Eukaryota:
                    
                    Category            sequences (% of Eukaryota) (% of the complete database)
                    Human                      0 (  0%)           (  0%)
                    Other Mammalia        174802 ( 18%)           (  7%)
                    Other Vertebrata      137413 ( 14%)           (  5%)
                    Viridiplantae         204082 ( 21%)           (  8%)
                    Fungi                 137160 ( 14%)           (  5%)
                    Insecta                98720 ( 10%)           (  4%)
                    Nematoda               36538 (  4%)           (  1%)
                    Other                 195390 ( 20%)           (  7%)
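
The percentages in both tables appear to be plain shares rounded to whole numbers, with small non-zero shares shown as "<1%". A minimal Python sketch under that assumption; the helper `percent_label` is hypothetical and the counts are copied from the tables above.

```python
# Sequence counts per kingdom, copied from section 3.3.
kingdoms = {
    "Archaea": 63173,
    "Bacteria": 1193711,
    "Eukaryota": 984105,
    "Viruses": 362130,
    "Other": 2455,
}

# Sequence counts within Eukaryota, copied from the second table.
eukaryota = {
    "Human": 0,
    "Other Mammalia": 174802,
    "Other Vertebrata": 137413,
    "Viridiplantae": 204082,
    "Fungi": 137160,
    "Insecta": 98720,
    "Nematoda": 36538,
    "Other": 195390,
}


def percent_label(count: int, whole: int) -> str:
    """Return a whole-number percentage; non-zero shares that round to 0 become '<1%'."""
    pct = 100 * count / whole
    return "<1%" if 0 < pct < 0.5 else f"{round(pct)}%"


total = sum(kingdoms.values())
for name, count in kingdoms.items():
    print(f"{name:<12}{count:>9} ({percent_label(count, total)})")

for name, count in eukaryota.items():
    print(f"{name:<18}{count:>7} ({percent_label(count, kingdoms['Eukaryota'])}) "
          f"({percent_label(count, total)})")
```

The kingdom counts add up to the release total of 2'605'574 entries, and the Eukaryota categories add up to the Eukaryota count, so both tables partition their parent totals.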
                    
                    
                    4.  SEQUENCE SIZE
                    
                    4.1  Repartition of the sequences by size (excluding fragments)
                    
                    From   To  Number             From   To   Number
                    1-  50   30122             1001-1100    15745
                    51- 100  163087             1101-1200    11054
                    101- 150  210030             1201-1300     7947
                    151- 200  196359             1301-1400     5191
                    201- 250  198812             1401-1500     4323
                    251- 300  186329             1501-1600     2996
                    301- 350  177963             1601-1700     2360
                    351- 400  142131             1701-1800     1985
                    401- 450  113612             1801-1900     1498
                    451- 500   97281             1901-2000     1280
                    501- 550   73289             2001-2100      953
                    551- 600   52919             2101-2200     1068
                    601- 650   40519             2201-2300      877
                    651- 700   31396             2301-2400      695
                    701- 750   27159             2401-2500      514
                    751- 800   22979             >2500         4643
                    801- 850   18351
                    851- 900   16327
                    901- 950   12048
                    951-1000    9446
                    
                    
                    4.2  Longest and shortest sequences
                    
                    The shortest sequence is Q16047_HUMAN:     4 amino acids.
                    The longest sequence is  Q3ASY8_CHLCH: 36805 amino acids.
                    
                    
                    5.  STATISTICS FOR SOME LINE TYPES
                    
                    The following table summarizes the total number of some UniProtKB/TrEMBL 
                    lines, as well as the number of entries with at least one such line, and the
                    frequency of the lines.
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry
                    ---------------------------------  -------- ---------  ---------
                    
                    References (RL)                    3900912              1.50
                    Journal                         2025956   1662223    0.78
                    Submitted to EMBL/GenBank/DDBJ  1829984   1216539    0.70
                    Thesis                             4894      4841   <0.01
                    Book citation                      4117      4074   <0.01
                    Submitted to other databases        435       427   <0.01
                    Other                             35526     21675    0.01
                    
                    Comments (CC)                      1391074              0.53
                    CAUTION                          555822    555822    0.21
                    SIMILARITY                       287304    283060    0.11
                    SUBCELLULAR LOCATION             139002    139002    0.05
                    FUNCTION                         135438    134264    0.05
                    CATALYTIC ACTIVITY                98620     95511    0.04
                    SUBUNIT                           65771     65771    0.03
                    COFACTOR                          51801     51334    0.02
                    PATHWAY                           44306     42318    0.02
                    DOMAIN                             5498      3877   <0.01
                    MISCELLANEOUS                      3737      3727   <0.01
                    INTERACTION                        3643      3643   <0.01
                    MASS SPECTROMETRY                   116        61   <0.01
                    ALLERGEN                             16        16   <0.01
                    
                    Features (FT)                      1334625              0.51
                    NON_TER                         1204261    720162    0.46
                    SIGNAL                            84376     81398    0.03
                    CHAIN                             45421     27275    0.02
                    TRANSIT                             567       563   <0.01
                    
                    Cross-references (DR)             18719573              7.18
                    GO                              5741378   1631013    2.20
                    InterPro                        3343552   1700220    1.28
                    EMBL                            2991301   2596793    1.15
                    Pfam                            2100412   1572588    0.81
                    PROSITE                         1218339    783259    0.47
                    PRINTS                           519797    431928    0.20
                    SMART                            394067    312319    0.15
                    SMR                              294379    294364    0.11
                    BioCyc                           290893    275436    0.11
                    TIGRFAMs                         289333    268049    0.11
                    HSSP                             282910    282629    0.11
                    ProDom                           277265    266338    0.11
                    PANTHER                          246734    236085    0.09
                    PIR                              195329    159779    0.07
                    Ensembl                          111130    111128    0.04
                    TIGR                              96019     89923    0.04
                    Gramene                           57215     57183    0.02
                    PIRSF                             56960     56161    0.02
                    MGI                               46996     44473    0.02
                    FlyBase                           26906     26866    0.01
                    TAIR                              20427     20360    0.01
                    WormPep                           19117     19036    0.01
                    WormBase                          19116     19036    0.01
                    LinkHub                           15357     15357    0.01
                    ZFIN                              11986     11982   <0.01
                    MEROPS                             8168      7910   <0.01
                    IntAct                             5821      5821   <0.01
                    LegioList                          5569      5539   <0.01
                    ListiList                          4770      4753   <0.01
                    AGD                                4295      4295   <0.01
                    PhotoList                          4155      4031   <0.01
                    PDB                                3162      1872   <0.01
                    HGNC                               3063      3063   <0.01
                    TubercuList                        2555      2549   <0.01
                    RGD                                2144      2132   <0.01
                    GeneDB_Spombe                      1963      1957   <0.01
                    SagaList                           1780      1686   <0.01
                    SGD                                1327      1323   <0.01
                    Leproma                             980       979   <0.01
                    DictyBase                           979       979   <0.01
                    TRANSFAC                            954       942   <0.01
                    MypuList                            601       597   <0.01
                    REBASE                              124       119   <0.01
                    PHCI-2DPAGE                         108       108   <0.01
                    ANU-2DPAGE                           65        65   <0.01
                    SWISS-2DPAGE                         52        52   <0.01
                    Reactome                             14        14   <0.01
                    PMMA-2DPAGE                           3         3   <0.01
                    Siena-2DPAGE                          2         2   <0.01
                    COMPLUYEAST-2DPAGE                    1         1   <0.01
                    
                    Number of explicitly cross-referenced databases: 70
                    
                    
                    6.  MISCELLANEOUS STATISTICS
                    
                    Total number of distinct authors cited in UniProtKB/TrEMBL: 222640
                    
                    Total number of entries encoded on a Mitochondrion: 124483
                    Total number of entries encoded on a Plasmid: 41026
                    Total number of entries encoded on a Plastid: 2319
                    Total number of entries encoded on a Plastid; Apicoplast: 125
                    Total number of entries encoded on a Plastid; Chloroplast: 45103
                    Total number of entries encoded on a Plastid; Cyanelle: 5
                    Total number of entries encoded on a Plastid; Non-photosynthetic plastid: 
                    
                    Number of fragments: 722286
                    
                

Submissions and Updates

We welcome feedback from our users. We would especially appreciate being notified if you find that sequences belonging to your field of expertise are missing from the database. We would also like to hear about annotations that need updating, for example when the function of a protein has been clarified or new information about post-translational modifications has become available.

Submit new sequence data, updates and corrections at http://www.uniprot.org/support/submissions.shtml

For all queries regarding submissions to UniProtKB and to submit new protein sequence data, please contact:

UniProt Knowledgebase
The EMBL Outstation - The European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom

Telephone: (+44 1223) 494 462
Telefax: (+44 1223) 494 468
E-mail: datasubs@ebi.ac.uk


Download information

Bi-Weekly releases

The latest data of the UniProt Knowledgebase is available in various formats (flatfile, XML or FASTA) at http://www.uniprot.org/database/download.shtml. The data is supplemented by a file containing the sequences of all additional splice isoforms annotated in UniProtKB/Swiss-Prot. This data set is documented in the file ftp://ftp.expasy.org/databases/uniprot/current_release/knowledgebase/complete/README.varsplic
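
As an illustration of working with the FASTA download, the following minimal Python sketch counts entries and amino acids in FASTA-formatted text, the same two quantities reported in the release statistics. It is not part of the UniProt distribution: the function names read_fasta and count_residues are hypothetical, the two records are made up for the example, and the sketch assumes only the general FASTA convention of a header line starting with ">" followed by one or more sequence lines.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines.

    Hypothetical helper: assumes each record starts with a '>' header
    line and that all following non-empty lines are sequence data.
    """
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def count_residues(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (number of entries, total number of amino acids)."""
    n_entries = 0
    n_residues = 0
    for _, sequence in read_fasta(lines):
        n_entries += 1
        n_residues += len(sequence)
    return n_entries, n_residues


# Made-up records, for illustration only (not real UniProtKB entries).
example = [
    ">example1 hypothetical record",
    "MKT",
    "AYI",
    ">example2",
    "GAVL",
]
entries, residues = count_residues(example)
print(entries, residues)
```

For a downloaded file, an open file handle can be passed in place of the example list, since the functions only iterate over lines.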

Major releases

For users who wish to download the UniProt Knowledgebase only occasionally, we distribute the latest major release (updated 3 times per year) in flatfile format. Previous UniProtKB/Swiss-Prot and UniProtKB/TrEMBL releases are archived under ftp://ftp.uniprot.org/pub/databases/uniprot/previous_major_releases. The UniProt Knowledgebase major release is also available on CD-ROM from the EBI.


Contact

EMBL Outstation
European Bioinformatics Institute (EBI)
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom

Telephone: (+44 1223) 494 444
Fax: (+44 1223) 494 468
Electronic mail address: datalib@ebi.ac.uk / swissprot@ebi.ac.uk
WWW server: http://www.ebi.ac.uk/


SIB Swiss Institute of Bioinformatics
Centre Medical Universitaire
1, rue Michel Servet
1211 Geneva 4
Switzerland

Telephone: (+41 22) 702 50 50
Fax: (+41 22) 702 58 58
Electronic mail address: Swiss-Prot@expasy.org
WWW server: http://www.expasy.org/


Protein Information Resource (PIR)
Georgetown University Medical Center
3900 Reservoir Road, NW
Box 571455
Washington, DC 20057-1455
United States of America

Telephone: (+1 202) 687 1039
Fax: (+1 202) 687 0057
Electronic mail address: pirmail@georgetown.edu
WWW server: http://pir.georgetown.edu

Citation

If you want to cite UniProt in a publication please use the following reference:

Wu C.H., Apweiler R., Bairoch A., Natale D.A., Barker W.C., Boeckmann B., Ferro S., Gasteiger E., Huang H., Lopez R., Magrane M., Martin M.J., Mazumder R., O'Donovan C., Redaschi N., Suzek B. The Universal Protein Resource (UniProt): an expanding universe of protein information. Nucleic Acids Res. 34: D187-D191 (2006).