Release 5.0 of the UniProt Knowledgebase is composed of the UniProt/Swiss-Prot Protein Knowledgebase release 47.0 and the UniProt/TrEMBL Protein Database release 30.0.

More information on these databases can be found in the user manual: "What is the UniProt Knowledgebase?"


UniProt/Swiss-Prot protein knowledgebase release 47.0 statistics

Release 47.0 of 10-May-2005 of Swiss-Prot contains 181'577 sequence entries, comprising 65'746'672 amino acids abstracted from 128'440 references.

The growth of the database is summarized below.

Release Date Number of entries Number of amino acids
2.0 09/86 3'939 900'163
3.0 11/86 4'160 969'641
4.0 04/87 4'387 1'036'010
5.0 09/87 5'205 1'327'683
6.0 01/88 6'102 1'653'982
7.0 04/88 6'821 1'885'771
8.0 08/88 7'724 2'224'465
9.0 11/88 8'702 2'498'140
10.0 03/89 10'008 2'952'613
11.0 07/89 10'856 3'265'966
12.0 10/89 12'305 3'797'482
13.0 01/90 13'837 4'347'336
14.0 04/90 15'409 4'914'264
15.0 08/90 16'941 5'486'399
16.0 11/90 18'364 5'986'949
17.0 02/91 20'024 6'524'504
18.0 05/91 20'772 6'792'034
19.0 08/91 21'795 7'173'785
20.0 11/91 22'654 7'500'130
21.0 03/92 23'742 7'866'596
22.0 05/92 25'044 8'375'696
23.0 08/92 26'706 9'011'391
24.0 12/92 28'154 9'545'427
25.0 04/93 29'955 10'214'020
26.0 07/93 31'808 10'875'091
27.0 10/93 33'329 11'484'420
28.0 02/94 36'000 12'496'420
29.0 06/94 38'303 13'464'008
30.0 10/94 40'292 14'147'368
31.0 02/95 43'470 15'335'248
32.0 11/95 49'340 17'385'503
33.0 02/96 52'205 18'531'384
34.0 10/96 59'021 21'210'389
35.0 11/97 69'113 25'083'768
36.0 07/98 74'019 26'840'295
37.0 12/98 77'977 28'268'293
38.0 07/99 80'000 29'085'965
39.0 05/00 86'593 31'411'114
40.0 10/01 101'602 37'315'215
41.0 02/03 122'564 44'986'459
42.0 10/03 135'850 50'046'799
43.0 03/04 146'720 54'093'154
44.0 07/04 153'871 56'608'159
45.0 10/04 163'235 59'631'787
46.0 02/05 168'297 61'443'278
47.0 05/05 181'577 65'746'672
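The counts in this table use an apostrophe as the thousands separator, which most numeric parsers reject. The following is a minimal sketch of reading such rows, assuming each row is whitespace-separated exactly as shown above; the helper names `parse_count` and `parse_growth_row` are illustrative and not part of any UniProt tool.

```python
def parse_count(text: str) -> int:
    """Convert a count such as "65'746'672" to an int by dropping apostrophes."""
    return int(text.replace("'", ""))


def parse_growth_row(line: str) -> tuple[str, str, int, int]:
    """Split one table row into (release, date, entries, amino_acids)."""
    release, date, entries, amino_acids = line.split()
    return release, date, parse_count(entries), parse_count(amino_acids)


# Two rows copied from the growth table: the first listed release and release 47.0.
first = parse_growth_row("2.0 09/86 3'939 900'163")
latest = parse_growth_row("47.0 05/05 181'577 65'746'672")

# Ratio of entry counts between the two releases.
growth_factor = latest[2] / first[2]
print(f"Entries grew {growth_factor:.1f}-fold between release {first[0]} and {latest[0]}")
```

Release and date are kept as strings because values such as "09/86" are labels rather than numbers.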

In rare cases, Swiss-Prot entries are removed. Deleted entries are almost exclusively Open Reading Frames (ORFs) that were wrongly predicted to code for proteins. When there is enough evidence that these hypothetical proteins are not real, we remove them from Swiss-Prot. The document delac_sp.txt lists all accession numbers that were previously present in UniProt/Swiss-Prot but have since been deleted from the database.


Status of the model organisms

We have selected a number of organisms that are the target of genome sequencing and/or mapping projects and for which we intend to:

  • be as complete as possible. All sequences available at a given time should be immediately included in UniProt/Swiss-Prot. This also includes sequence corrections and updates;
  • provide a higher level of annotation;
  • provide cross-references to specialized database(s) that contain, among other data, some information about the genes that code for these proteins;
  • provide specific indexes and documents.

From our efforts to annotate human sequence entries as completely as possible arose the HPI project, and the bacterial model organisms became the focus of the HAMAP project. Here is the current status of the model organisms which are not covered by these two projects:

Organism        Database cross-references  Index file    Number of sequences
A.thaliana      None yet                   arath.txt     3'288
C.albicans      None yet                   calbican.txt  333
C.elegans       Wormpep                    celegans.txt  2'651
D.discoideum    DictyBase                  dicty.txt     324
D.melanogaster  FlyBase                    fly.txt       2'226
M.musculus      MGD                        mgdtosp.txt   9'228
S.cerevisiae    SGD                        yeast.txt     5'090
S.pombe         GeneDB_SPombe              pombe.txt     2'778

UniProt/Swiss-Prot release statistics
                    
                    1.  INTRODUCTION
                    
                    Release 47.0 of 10-May-2005 of Swiss-Prot contains 181'577 sequence entries,
                    comprising 65'746'672 amino acids abstracted from 128'440 references. 
                    
                    11'531 sequences have been added since release 46, the sequence data of
                    841 existing entries has been updated and the annotations of
                    166'572 entries have been revised. This represents an increase of 6%.
                    
                    
                    2.  AMINO ACID COMPOSITION
                    
                    2.1  Composition in percent for the complete database
                    
                    Ala (A) 7.84   Gln (Q) 3.94   Leu (L) 9.64   Ser (S) 6.85
                    Arg (R) 5.34   Glu (E) 6.61   Lys (K) 5.91   Thr (T) 5.44
                    Asn (N) 4.18   Gly (G) 6.95   Met (M) 2.38   Trp (W) 1.15
                    Asp (D) 5.31   His (H) 2.28   Phe (F) 4.00   Tyr (Y) 3.06
                    Cys (C) 1.54   Ile (I) 5.91   Pro (P) 4.83   Val (V) 6.73
                    
                    Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.01
                    
                    
                    2.2  Classification of the amino acids by their frequency
                    
                    Leu, Ala, Gly, Ser, Val, Glu, Lys, Ile, Thr, Arg, Asp, Pro, Asn, Phe,
                    Gln, Tyr, Met, His, Cys, Trp
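The ranking in 2.2 can be reproduced from the percentages in 2.1 with a descending sort. This is a small sketch, not the script used to produce the release notes; it assumes the values are entered in the reading order of the table (row by row), which matters only for the Lys/Ile tie at 5.91.

```python
# Percentages copied from section 2.1, in reading order (row by row).
composition = {
    "Ala": 7.84, "Gln": 3.94, "Leu": 9.64, "Ser": 6.85,
    "Arg": 5.34, "Glu": 6.61, "Lys": 5.91, "Thr": 5.44,
    "Asn": 4.18, "Gly": 6.95, "Met": 2.38, "Trp": 1.15,
    "Asp": 5.31, "His": 2.28, "Phe": 4.00, "Tyr": 3.06,
    "Cys": 1.54, "Ile": 5.91, "Pro": 4.83, "Val": 6.73,
}

# sorted() is stable, so residues with equal percentages keep their input order.
ranked = sorted(composition, key=lambda residue: -composition[residue])
print(", ".join(ranked))
```

Because the published percentages are rounded to two decimals, the relative order of Lys and Ile cannot be recovered from these figures alone; the sketch simply preserves input order for ties.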
                    
                    
                    3.  TAXONOMIC ORIGIN
                    
                    Total number of species represented in this release of Swiss-Prot: 9212
                    
                    The first twenty species represent 64219 sequences:  35.4 % of the total
                    number of entries.
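The coverage figure can be checked against the frequencies listed in table 3.2. The sketch below sums the first twenty rows and divides by the total entry count given in the introduction; the variable names are illustrative.

```python
TOTAL_ENTRIES = 181_577  # release 47.0 total, from the introduction

# Frequencies of the twenty most represented species (table 3.2, rows 1-20).
top_twenty = [
    12202, 9228, 5090, 4842, 4300, 3288, 2778, 2777, 2651, 2226,
    1782, 1773, 1738, 1562, 1500, 1412, 1400, 1383, 1157, 1130,
]

covered = sum(top_twenty)
share = 100 * covered / TOTAL_ENTRIES
print(f"{covered} sequences, {share:.1f} % of all entries")
# prints "64219 sequences, 35.4 % of all entries"
```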
                    
                    
                    3.1 Table of the frequency of occurrence of species
                    
                    Species represented       1x: 4395
                                              2x: 1441
                                              3x:  721
                                              4x:  464
                                              5x:  318
                                              6x:  275
                                              7x:  190
                                              8x:  160
                                              9x:  135
                                             10x:   74
                                         11- 20x:  376
                                         21- 50x:  298
                                         51-100x:  108
                                           >100x:  257
                    
                    
                    3.2  Table of the most represented species
                    
                    ------  ---------  --------------------------------------------
                    Number  Frequency  Species
                    ------  ---------  --------------------------------------------
                    1      12202  Homo sapiens (Human)
                    2       9228  Mus musculus (Mouse)
                    3       5090  Saccharomyces cerevisiae (Baker's yeast)
                    4       4842  Escherichia coli
                    5       4300  Rattus norvegicus (Rat)
                    6       3288  Arabidopsis thaliana (Mouse-ear cress)
                    7       2778  Schizosaccharomyces pombe (Fission yeast)
                    8       2777  Bacillus subtilis
                    9       2651  Caenorhabditis elegans
                    10       2226  Drosophila melanogaster (Fruit fly)
                    11       1782  Methanococcus jannaschii
                    12       1773  Haemophilus influenzae
                    13       1738  Escherichia coli O157:H7
                    14       1562  Bos taurus (Bovine)
                    15       1500  Salmonella typhimurium
                    16       1412  Escherichia coli O6
                    17       1400  Mycobacterium tuberculosis
                    18       1383  Shigella flexneri
                    19       1157  Gallus gallus (Chicken)
                    20       1130  Mycobacterium bovis
                    21       1087  Salmonella typhi
                    22       1019  Pseudomonas aeruginosa
                    23        960  Synechocystis sp. (strain PCC 6803)
                    24        960  Archaeoglobus fulgidus
                    25        958  Sus scrofa (Pig)
                    26        945  Xenopus laevis (African clawed frog)
                    27        816  Rhizobium meliloti (Sinorhizobium meliloti)
                    28        803  Vibrio cholerae
                    29        791  Yersinia pestis
                    30        760  Oryctolagus cuniculus (Rabbit)
                    31        745  Aquifex aeolicus
                    32        687  Mycoplasma pneumoniae
                    33        686  Pasteurella multocida
                    34        639  Vibrio parahaemolyticus
                    35        639  Streptomyces coelicolor
                    36        624  Bacillus halodurans
                    37        618  Mycobacterium leprae
                    38        607  Treponema pallidum
                    39        589  Vibrio vulnificus
                    40        579  Canis familiaris (Dog)
                    41        577  Methanobacterium thermoautotrophicum
                    42        577  Anabaena sp. (strain PCC 7120)
                    43        572  Buchnera aphidicola (subsp. Acyrthosiphon pisum) 
                    44        565  Staphylococcus aureus (strain Mu50 / ATCC 700699)
                    45        563  Helicobacter pylori (Campylobacter pylori)
                    46        562  Staphylococcus aureus (strain N315)
                    47        561  Buchnera aphidicola (subsp. Schizaphis graminum)
                    48        546  Rickettsia prowazekii
                    49        545  Staphylococcus aureus (strain MW2)
                    50        544  Helicobacter pylori J99 (Campylobacter pylori J99)
                    51        532  Pseudomonas putida (strain KT2440)
                    52        528  Pseudomonas syringae (pv. tomato)
                    53        522  Lactococcus lactis (subsp. lactis) (Streptococcus lactis)
                    54        520  Vibrio vulnificus (strain YJ016)
                    55        517  Zea mays (Maize)
                    56        515  Staphylococcus epidermidis
                    57        507  Buchnera aphidicola (subsp. Baizongia pistaciae)
                    58        506  Ralstonia solanacearum (Pseudomonas solanacearum)
                    59        505  Bacillus anthracis
                    60        505  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
                    61        500  Listeria monocytogenes
                    62        500  Bradyrhizobium japonicum
                    63        496  Listeria innocua
                    64        495  Rhizobium loti (Mesorhizobium loti)
                    65        487  Xanthomonas campestris (pv. campestris)
                    66        486  Mycoplasma genitalium
                    67        482  Neisseria meningitidis (serogroup B)
                    68        482  Neisseria meningitidis (serogroup A)
                    69        481  Oryza sativa (Rice)
                    70        479  Clostridium acetobutylicum
                    71        467  Caulobacter crescentus
                    72        463  Thermotoga maritima
                    73        450  Xanthomonas axonopodis (pv. citri)
                    74        445  Streptococcus pneumoniae
                    75        444  Photorhabdus luminescens (subsp. laumondii)
                    76        440  Shewanella oneidensis
                    77        440  Xylella fastidiosa
                    78        439  Deinococcus radiodurans
                    79        438  Pan troglodytes (Chimpanzee)
                    80        434  Brachydanio rerio (Zebrafish) (Danio rerio)
                    81        433  Bacillus cereus (strain ATCC 14579 / DSM 31)
                    82        432  Pyrococcus horikoshii
                    83        431  Chlamydia trachomatis
                    84        428  Xylella fastidiosa (strain Temecula1 / ATCC 700964)
                    85        427  Pyrococcus abyssi
                    86        419  Methanosarcina acetivorans
                    87        417  Borrelia burgdorferi (Lyme disease spirochete)
                    88        417  Brucella suis
                    89        417  Clostridium perfringens
                    90        416  Brucella melitensis
                    91        415  Corynebacterium glutamicum (Brevibacterium flavum)
                    92        412  Chlamydia pneumoniae (Chlamydophila pneumoniae)
                    93        404  Oceanobacillus iheyensis
                    94        404  Rhizobium sp. (strain NGR234)
                    95        403  Staphylococcus aureus (strain MRSA252)
                    96        402  Chlamydia muridarum
                    97        402  Methanosarcina mazei (Methanosarcina frisia)
                    98        401  Halobacterium sp. (strain NRC-1 / ATCC 700922 / JCM 11081)
                    99        400  Staphylococcus aureus (strain MSSA476)
                    100        390  Pyrococcus furiosus
                    101        386  Thermoanaerobacter tengcongensis
                    102        382  Lactobacillus plantarum
                    103        381  Ovis aries (Sheep)
                    104        381  Sulfolobus solfataricus
                    105        380  Campylobacter jejuni
                    106        380  Neurospora crassa
                    107        371  Streptococcus pyogenes
                    108        369  Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
                    109        368  Nicotiana tabacum (Common tobacco)
                    110        364  Rickettsia conorii
                    111        361  Streptococcus mutans
                    112        357  Synechococcus elongatus (Thermosynechococcus elongatus)
                    113        345  Pongo pygmaeus (Orangutan)
                    114        342  Chlorobium tepidum
                    115        338  Enterococcus faecalis (Streptococcus faecalis)
                    116        337  Bordetella bronchiseptica (Alcaligenes bronchisepticus)
                    117        336  Macaca fascicularis (Crab eating macaque) (Cynomolgus monkey)
                    118        335  Aeropyrum pernix
                    119        333  Candida albicans (Yeast)
                    120        333  Bordetella pertussis
                    121        328  Streptomyces avermitilis
                    122        327  Bordetella parapertussis
                    123        327  Haemophilus ducreyi
                    124        327  Streptococcus pyogenes (serotype M18)
                    125        325  Chromobacterium violaceum
                    126        324  Dictyostelium discoideum (Slime mold)
                    127        323  Streptococcus pyogenes (serotype M3)
                    128        321  Staphylococcus aureus
                    129        320  Methanopyrus kandleri
                    130        310  Corynebacterium efficiens
                    131        307  Pisum sativum (Garden pea)
                    132        304  Sulfolobus tokodaii
                    133        300  Yersinia pseudotuberculosis
                    134        296  Leptospira interrogans
                    135        293  Nitrosomonas europaea
                    136        291  Thermoplasma acidophilum
                    137        283  Triticum aestivum (Wheat)
                    138        282  Streptococcus agalactiae (serotype V)
                    139        281  Streptococcus agalactiae (serotype III)
                    140        278  Fusobacterium nucleatum (subsp. nucleatum)
                    141        272  Hordeum vulgare (Barley)
                    142        268  Lycopersicon esculentum (Tomato)
                    143        268  Bacteriophage T4
                    144        266  Glycine max (Soybean)
                    145        261  Cavia porcellus (Guinea pig)
                    146        261  Gloeobacter violaceus
                    147        260  Bacillus cereus (strain ATCC 10987)
                    148        257  Thermoplasma volcanium
                    149        256  Solanum tuberosum (Potato)
                    150        256  Pyrobaculum aerophilum
                    151        254  Rhodobacter capsulatus (Rhodopseudomonas capsulata)
                    152        254  Vaccinia virus (strain Copenhagen) (VACV)
                    153        254  Synechococcus sp. (strain WH8102)
                    154        250  Pseudomonas putida
                    155        247  Prochlorococcus marinus (strain MIT 9313)
                    156        245  Prochlorococcus marinus
                    157        244  Coxiella burnetii
                    158        243  Kluyveromyces lactis (Yeast)
                    159        242  Spinacia oleracea (Spinach)
                    160        242  Macaca mulatta (Rhesus macaque)
                    161        242  Clostridium tetani
                    162        241  Ureaplasma parvum (Ureaplasma urealyticum biotype 1)
                    163        241  Erwinia carotovora (subsp. atroseptica) (Pectobacterium atrosepticum)
                    164        236  Bacteroides thetaiotaomicron
                    165        233  Bacillus stearothermophilus
                    166        233  Prochlorococcus marinus subsp. pastoris (strain CCMP 1378 / MED4)
                    167        231  Rhodopseudomonas palustris
                    168        228  Photobacterium profundum (Photobacterium sp. (strain SS9))
                    169        225  Wolinella succinogenes
                    170        225  Wigglesworthia glossinidia brevipalpis
                    171        224  Equus caballus (Horse)
                    172        224  Chlamydophila caviae
                    173        220  Porphyra purpurea
                    174        220  Ashbya gossypii (Yeast) (Eremothecium gossypii)
                    175        214  Leptospira interrogans (serogroup Icterohaemorrhagiae / serovar Copenhageni)
                    176        213  Chlamydomonas reinhardtii
                    177        212  Bifidobacterium longum
                    178        209  Klebsiella pneumoniae
                    179        205  Listeria monocytogenes (serotype 4b / strain F2365)
                    180        204  Porphyromonas gingivalis (Bacteroides gingivalis)
                    181        204  Rhodopirellula baltica
                    182        203  Mycobacterium paratuberculosis
                    183        200  Acinetobacter sp. (strain ADP1)
                    184        200  Vaccinia virus (strain Western Reserve / WR) (VACV)
                    
                    
                    
                    3.3  Taxonomic distribution of the sequences
                    
                    Kingdom        sequences (% of the database)
                    Archaea            9277 (  5%)
                    Bacteria          82443 ( 45%)
                    Eukaryota         80554 ( 44%)
                    Viruses            9303 (  5%)
                    
                    
                    Within Eukaryota:
                    
                    Category            sequences (% of Eukaryota) (% of the complete database)
                    Human                  12203 ( 15%)           (  7%)
                    Other Mammalia         23783 ( 30%)           ( 13%)
                    Other Vertebrata        7207 (  9%)           (  4%)
                    Viridiplantae          12609 ( 16%)           (  7%)
                    Fungi                  11668 ( 14%)           (  6%)
                    Insecta                 4327 (  5%)           (  2%)
                    Nematoda                2930 (  4%)           (  2%)
                    Other                   5827 (  7%)           (  3%)
                    
                    
                    4.  SEQUENCE SIZE
                    
                    Distribution of the sequences by size (excluding fragments)
                    
                    From   To  Number        From   To  Number
                       1-  50    3796        1001-1100    1494
                      51- 100   12862        1101-1200    1068
                     101- 150   18457        1201-1300     767
                     151- 200   17580        1301-1400     591
                     201- 250   18107        1401-1500     454
                     251- 300   15445        1501-1600     290
                     301- 350   16195        1601-1700     216
                     351- 400   14600        1701-1800     162
                     401- 450   11251        1801-1900     177
                     451- 500    9483        1901-2000     141
                     501- 550    7016        2001-2100      86
                     551- 600    4881        2101-2200     131
                     601- 650    4022        2201-2300     115
                     651- 700    2888        2301-2400      76
                     701- 750    2437        2401-2500      63
                     751- 800    2037            >2500     461
                     801- 850    1633
                     851- 900    1804
                     901- 950    1267
                     951-1000    1040
                    
                    
                    The average sequence length in Swiss-Prot is 362 amino acids.
                    
                    The shortest sequence is   GWA_SEPOF (P83570):     2 amino acids.
                    The longest sequence is  SYNE1_HUMAN (Q8NF91):  8797 amino acids.
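The stated average is consistent with dividing the total number of amino acids by the total number of entries, both taken from the introduction. The release notes do not say how the average was computed (for example, whether fragments are excluded), so this sketch only shows that the simple ratio rounds to the same value.

```python
total_amino_acids = 65_746_672  # from the introduction
total_entries = 181_577         # from the introduction

# Mean length over all entries, assuming every entry counts equally.
mean_length = total_amino_acids / total_entries
print(round(mean_length))
# prints 362
```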
                    
                    
                    5.  JOURNAL CITATIONS
                    
                    Note: the following citation statistics reflect the number of distinct
                    journal citations.
                    
                    Total number of journals cited in this release of Swiss-Prot: 1579
                    
                    
                    5.1 Table of the frequency of journal citations
                    
                    Journals cited       1x:  570
                                         2x:  219
                                         3x:  108
                                         4x:   74
                                         5x:   58
                                         6x:   31
                                         7x:   38
                                         8x:   27
                                         9x:   22
                                        10x:   14
                                    11- 20x:  123
                                    21- 50x:  127
                                    51-100x:   55
                                      >100x:  113
                    
                    
                    5.2  List of the most cited journals in Swiss-Prot
                    
                    Nb    Citations   Journal name
                    --    ---------   -------------------------------------------------------------
                    1        11906   Journal of Biological Chemistry
                    2         6037   Proceedings of the National Academy of Sciences of the U.S.A.
                    3         4124   Journal of Bacteriology
                    4         3852   Gene
                    5         3833   Nucleic Acids Research
                    6         3227   Biochemical and Biophysical Research Communications
                    7         3188   FEBS Letters
                    8         2861   Biochemistry
                    9         2776   European Journal of Biochemistry
                    10         2674   The EMBO Journal
                    11         2443   Nature
                    12         2410   Biochimica et Biophysica Acta
                    13         2180   Journal of Molecular Biology
                    14         2076   Genomics
                    15         2006   Molecular and Cellular Biology
                    16         1960   Cell
                    17         1567   Biochemical Journal
                    18         1458   Science
                    19         1302   Molecular Microbiology
                    20         1235   Plant Molecular Biology
                    21         1225   Molecular and General Genetics
                    22         1001   Journal of Biochemistry
                    23          981   Journal of Cell Biology
                    24          943   Virology
                    25          927   Human Molecular Genetics
                    26          857   Nature Genetics
                    27          797   Genes and Development
                    28          796   Journal of Virology
                    29          744   The American Journal of Human Genetics
                    30          743   Oncogene
                    31          720   Plant Physiology
                    32          708   Human Mutation
                    33          648   Journal of Immunology
                    34          635   Infection and Immunity
                    35          623   Archives of Biochemistry and Biophysics
                    36          615   Yeast
                    37          610   Structure
                    38          567   Development
                    39          561   Journal of General Virology
                    40          539   Microbiology
                    41          521   Genetics
                    42          507   FEMS Microbiology Letters
                    43          492   Nature Structural Biology
                    44          448   Human Genetics
                    45          448   Blood
                    46          443   Current Genetics
                    47          387   Molecular and Biochemical Parasitology
                    48          384   Applied and Environmental Microbiology
                    49          378   Molecular Biology of the Cell
                    50          372   Journal of Clinical Investigation
                    51          363   Developmental Biology
                    52          359   Mammalian Genome
                    53          356   Cancer Research
                    54          353   Molecular Endocrinology
                    55          352   The Plant Cell
                    56          351   Protein Science
                    57          338   Acta Crystallographica, Section D
                    58          334   Journal of Cell Science
                    59          333   Immunogenetics
                    60          333   Mechanisms of Development
                    61          332   Neuron
                    62          324   The Journal of Experimental Medicine
                    63          320   Journal of Molecular Evolution
                    64          311   DNA and Cell Biology
                    65          305   The Plant Journal
                    66          292   Journal of Neuroscience
                    67          286   Endocrinology
                    68          282   Biological Chemistry Hoppe-Seyler
                    69          273   DNA Sequence
                    70          263   Molecular Cell
                    71          260   Journal of Neurochemistry
                    72          249   Molecular Biology and Evolution
                    73          247   The Journal of Clinical Endocrinology and Metabolism
                    74          245   Current Biology
                    75          239   Journal of General Microbiology
                    76          239   Brain Research. Molecular Brain Research
                    77          232   Toxicon
                    78          229   Bioscience, Biotechnology, and Biochemistry
                    79          222   American Journal of Physiology
                    80          221   Cytogenetics and Cell Genetics
                    81          214   Comparative Biochemistry and Physiology
                    82          214   Hoppe-Seyler's Zeitschrift fur Physiologische Chemie
                    83          186   Molecular Pharmacology
                    84          185   Antimicrobial Agents and Chemotherapy
                    85          173   Proteins
                    86          172   Journal of Investigative Dermatology
                    87          163   Journal of Medical Genetics
                    88          158   DNA Research
                    89          158   DNA
                    90          155   Peptides
                    91          154   Molecular Plant-Microbe Interactions
                    92          152   Genome Research
                    93          152   Virus Research
                    94          150   American Journal of Medical Genetics
                    95          148   Tissue Antigens
                    96          143   Biochimie
                    97          139   Biology of Reproduction
                    98          138   Bioorganicheskaia Khimiia
                    99          135   Hemoglobin
                    100          134   European Journal of Immunology
                    101          130   Molecular and Cellular Endocrinology
                    102          130   Plant and Cell Physiology
                    103          117   Insect Biochemistry and Molecular Biology
                    104          116   Agricultural and Biological Chemistry
                    105          114   Archives of Microbiology
                    106          114   Molecular Phylogenetics and Evolution
                    107          107   General and Comparative Endocrinology
                    108          107   Annals of Neurology
                    109          104   European Journal of Human Genetics
                    110          103   Diabetes
                    111          103   Experimental Cell Research
                    112          102   Journal of Human Genetics
                    113          102   Neurology
                    
                    
                    6.  STATISTICS FOR SOME LINE TYPES
                    
                    The following table summarizes the total number of some Swiss-Prot lines,
                    as well as the number of entries with at least one such line, and the
                    frequency of the lines.
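The "Average per entry" column in the table that follows appears to be the total number of lines divided by the total number of entries in the database (181'577), not by the number of entries carrying such a line. The sketch below reproduces a few of the published values under that assumption; the function name `average_per_entry` and the rule for printing `<0.01` (any value that rounds to 0.00) are assumptions inferred from the table rather than documented behaviour.

```python
TOTAL_ENTRIES = 181_577  # release 47.0 total, from the introduction


def average_per_entry(total_lines: int, entries: int = TOTAL_ENTRIES) -> str:
    """Format lines-per-entry the way the table appears to (assumed rule)."""
    formatted = f"{total_lines / entries:.2f}"
    # Assumption: averages that round to 0.00 are shown as "<0.01".
    return "<0.01" if formatted == "0.00" else formatted


# Totals copied from the table that follows.
print("References (RL):        ", average_per_entry(354_347))
print("Journal:                ", average_per_entry(314_613))
print("Submitted to Swiss-Prot:", average_per_entry(646))
print("Features (FT):          ", average_per_entry(1_008_341))
```

Dividing the journal total by the 170'221 entries that actually have a journal reference would give about 1.85 instead of the published 1.73, which is why the whole-database denominator is assumed here.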
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry
                    ---------------------------------  -------- ---------  ---------
                    
                    References (RL)                     354347              1.95
                    Journal                          314613    170221    1.73
                    Submitted to EMBL/GenBank/DDBJ    36948     31617    0.20
                    Submitted to Swiss-Prot             646       643   <0.01
                    Plant Gene Register                 500       488   <0.01
                    Book citation                       490       478   <0.01
                    Unpublished observations            397       393   <0.01
                    Thesis                              288       286   <0.01
                    Submitted to other databases        254       250   <0.01
                    Patent                              122       120   <0.01
                    Unpublished results                  83        81   <0.01
                    Worm Breeder's Gazette                6         6   <0.01
                    
                    Comments (CC)                       669562              3.69
                    SIMILARITY                       192885    162260    1.06
                    FUNCTION                         121080    118343    0.67
                    SUBCELLULAR LOCATION              90200     90200    0.50
                    CATALYTIC ACTIVITY                64662     60703    0.36
                    SUBUNIT                           58853     58853    0.32
                    PATHWAY                           32505     29804    0.18
                    COFACTOR                          21977     21977    0.12
                    TISSUE SPECIFICITY                19565     19565    0.11
                    PTM                               12088     10734    0.07
                    MISCELLANEOUS                     10225      9394    0.06
                    DOMAIN                             8227      7239    0.05
                    ALTERNATIVE PRODUCTS               7024      7024    0.04
                    CAUTION                            6313      5604    0.03
                    INDUCTION                          5029      5029    0.03
                    DEVELOPMENTAL STAGE                4666      4666    0.03
                    INTERACTION                        3083      3083    0.02
                    DISEASE                            2933      2140    0.02
                    ENZYME REGULATION                  2551      2551    0.01
                    MASS SPECTROMETRY                  1754      1532    0.01
                    DATABASE                           1302      1241    0.01
                    BIOPHYSICOCHEMICAL PROPERTIES       961       961    0.01
                    POLYMORPHISM                        504       491   <0.01
                    ALLERGEN                            380       380   <0.01
                    RNA EDITING                         355       355   <0.01
                    TOXIC DOSE                          269       268   <0.01
                    BIOTECHNOLOGY                       116       116   <0.01
                    PHARMACEUTICAL                       55        55   <0.01
                    
                    Features (FT)                      1008341              5.55
                    TRANSMEM                         115652     25159    0.64
                    METAL                             70212     17480    0.39
                    CONFLICT                          67057     23460    0.37
                    TURN                              62464      4662    0.34
                    CARBOHYD                          59729     14988    0.33
                    STRAND                            57266      4165    0.32
                    DISULFID                          54939     14707    0.30
                    TOPO_DOM                          52817     11324    0.29
                    DOMAIN                            48227     25421    0.27
                    HELIX                             45089      4519    0.25
                    ACT_SITE                          41384     24663    0.23
                    REPEAT                            38277      5571    0.21
                    VARIANT                           33268      6451    0.18
                    CHAIN                             29926     24316    0.16
                    NP_BIND                           26120     18212    0.14
                    MOD_RES                           22591     11680    0.12
                    REGION                            21623     10792    0.12
                    SIGNAL                            19085     19083    0.11
                    COMPBIAS                          17474      9465    0.10
                    BINDING                           15645     10259    0.09
                    VARSPLIC                          14106      6212    0.08
                    SITE                              11811      6609    0.07
                    ZN_FING                           11593      4480    0.06
                    MUTAGEN                           11144      2918    0.06
                    NON_TER                           10952      8325    0.06
                    MOTIF                              8401      6395    0.05
                    INIT_MET                           8137      8073    0.04
                    PROPEP                             6236      5233    0.03
                    DNA_BIND                           5521      5189    0.03
                    LIPID                              5352      3532    0.03
                    COILED                             4676      2837    0.03
                    PEPTIDE                            3812      1774    0.02
                    TRANSIT                            3214      3184    0.02
                    CA_BIND                            2288       928    0.01
                    NON_CONS                           1052       506    0.01
                    CROSSLNK                            592       480   <0.01
                    UNSURE                              416       170   <0.01
                    SE_CYS                              193       135   <0.01
                    
                    Cross-references (DR)              1842067             10.14
                    InterPro                         362455    164421    2.00
                    EMBL                             350521    173966    1.93
                    Pfam                             211867    155896    1.17
                    PROSITE                          163750    101742    0.90
                    PIR                               92542     85789    0.51
                    GO                                83471     23262    0.46
                    HSSP                              71939     71939    0.40
                    PRINTS                            67421     52265    0.37
                    TIGRFAMs                          64396     60065    0.35
                    HAMAP                             58792     58682    0.32
                    ProDom                            49046     47123    0.27
                    SMART                             44819     34097    0.25
                    Ensembl                           34859     34856    0.19
                    PDB                               29940      8125    0.16
                    SMR                               23589     23589    0.13
                    TIGR                              17791     17285    0.10
                    PIRSF                             13545     13348    0.07
                    Genew                             11320     11263    0.06
                    MIM                               10709      8803    0.06
                    MGI                                8850      8809    0.05
                    PANTHER                            7807      7795    0.04
                    SGD                                5140      5079    0.03
                    GermOnline                         4927      4877    0.03
                    EcoGene                            4225      4223    0.02
                    EchoBASE                           4159      4127    0.02
                    IntAct                             3946      3946    0.02
                    MEROPS                             3726      3615    0.02
                    H-InvDB                            3677      3659    0.02
                    WormPep                            3036      2648    0.02
                    RGD                                3010      3007    0.02
                    FlyBase                            2825      2797    0.02
                    GeneDB_Spombe                      2806      2776    0.02
                    TRANSFAC                           2749      2465    0.02
                    SubtiList                          2727      2726    0.02
                    WormBase                           2710      2635    0.01
                    StyGene                            1454      1451    0.01
                    TubercuList                        1428      1392    0.01
                    SWISS-2DPAGE                       1132      1132    0.01
                    ListiList                           997       989    0.01
                    GeneFarm                            952       948    0.01
                    Reactome                            720       720   <0.01
                    Gramene                             641       609   <0.01
                    Leproma                             622       618   <0.01
                    PhotoList                           444       444   <0.01
                    ZFIN                                434       427   <0.01
                    MaizeDB                             423       418   <0.01
                    HIV                                 370       365   <0.01
                    REBASE                              368       363   <0.01
                    OGP                                 365       365   <0.01
                    ECO2DBASE                           351       299   <0.01
                    DictyBase                           325       323   <0.01
                    GlycoSuiteDB                        283       283   <0.01
                    SagaList                            282       281   <0.01
                    PHCI-2DPAGE                         239       239   <0.01
                    AGD                                 226       220   <0.01
                    LegioList                           184       184   <0.01
                    MypuList                            173       173   <0.01
                    Aarhus/Ghent-2DPAGE                 128        98   <0.01
                    Siena-2DPAGE                        103       103   <0.01
                    HSC-2DPAGE                           85        85   <0.01
                    COMPLUYEAST-2DPAGE                   59        59   <0.01
                    PhosSite                             54        54   <0.01
                    PMMA-2DPAGE                          52        52   <0.01
                    Maize-2DPAGE                         39        39   <0.01
                    Rat-heart-2DPAGE                     28        28   <0.01
                    ANU-2DPAGE                           14        14   <0.01
                    
                    Number of explicitly cross-referenced databases: 67
                    Number of implicitly cross-referenced databases: 31
                    
                    
                    7.  MISCELLANEOUS STATISTICS
                    
                    Total number of distinct authors cited in Swiss-Prot: 201756
                    
                    Total number of entries encoded on a chloroplast: 4293
                    Total number of entries encoded on a mitochondrion: 3318
                    Total number of entries encoded on a cyanelle: 145
                    Total number of entries encoded on a plasmid: 3019
                    
                    Number of fragments: 8484
                    Number of additional sequences encoded on splice variants: 10767
                    
                

UniProt/TrEMBL protein database release 30.0 statistics

                    
                    1.  INTRODUCTION
                    
                    Release 30.0 of 10-May-2005 of UniProt/TrEMBL has been produced in sync
                    with UniProt/Swiss-Prot release 47 and EMBL/DDBJ/GenBank nucleotide
                    sequence database release 81 and updates until 16-April-2005. It contains
                    1'714'475 sequence entries, comprising 540'729'498 amino acids.
                    
                    149'924 sequences have been added since release 29. This represents an 
                    increase of 11.24%.
                    
                    In the document delac_tr.txt, you will find a list of all accession numbers
                    which were previously present in UniProt/TrEMBL, but which have now been
                    deleted from the database. Most deletions are due to the deletion of the
                    corresponding CDS in the source nucleotide sequence databases
                    EMBL-Bank/DDBJ/GenBank. In addition, some entries are recognised to be open
                    reading frames (ORFs) that have been wrongly predicted to code for proteins.
                    When there is enough evidence that these hypothetical proteins are not real,
                    we take the decision to remove them from TrEMBL. 
                    
                    
                    2.  AMINO ACID COMPOSITION
                    
                    2.1  Composition in percent for the complete database
                    
                    Ala (A) 7.72   Gln (Q) 3.88   Leu (L) 9.73   Ser (S) 7.10
                    Arg (R) 5.30   Glu (E) 6.08   Lys (K) 5.56   Thr (T) 5.73
                    Asn (N) 4.47   Gly (G) 6.89   Met (M) 2.41   Trp (W) 1.37
                    Asp (D) 5.10   His (H) 2.27   Phe (F) 4.14   Tyr (Y) 3.14
                    Cys (C) 1.50   Ile (I) 6.03   Pro (P) 4.94   Val (V) 6.48
                    
                    Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.07
                    
                    
                    2.2  Classification of the amino acids by their frequency
                    
                    Leu, Ala, Ser, Gly, Val, Glu, Ile, Thr, Lys, Arg, Asp, Pro, Asn, Phe,
                    Gln, Tyr, Met, His, Cys, Trp
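The ordering in Section 2.2 follows from sorting the percentages in Section 2.1 in descending order. The following is a minimal sketch of that step; the dictionary and function names are illustrative and not part of any UniProt tooling.

```python
# Composition in percent, copied from Section 2.1 (the 20 standard residues).
composition = {
    "Ala": 7.72, "Gln": 3.88, "Leu": 9.73, "Ser": 7.10,
    "Arg": 5.30, "Glu": 6.08, "Lys": 5.56, "Thr": 5.73,
    "Asn": 4.47, "Gly": 6.89, "Met": 2.41, "Trp": 1.37,
    "Asp": 5.10, "His": 2.27, "Phe": 4.14, "Tyr": 3.14,
    "Cys": 1.50, "Ile": 6.03, "Pro": 4.94, "Val": 6.48,
}


def rank_by_frequency(comp):
    """Return residue names sorted from most to least frequent."""
    return sorted(comp, key=comp.get, reverse=True)


ranking = rank_by_frequency(composition)
print(", ".join(ranking))
```

No two residues share the same percentage in this release, so the ordering is unambiguous.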
                    
                    
                    3.  TAXONOMIC ORIGIN
                    
                    Total number of species represented in this release of 
                    UniProt/TrEMBL: 89807
                    
                    The first twenty species represent 499903 sequences: 29.2 % of the
                    total number of entries.
                    
                    
                    3.1 Table of the frequency of occurrence of species
                    
                    Species represented 1x:44301
                                        2x:17044
                                        3x: 8565
                                        4x: 4547
                                        5x: 2681
                                        6x: 2006
                                        7x: 1349
                                        8x: 1150
                                        9x:  935
                                       10x:  816
                                   11- 20x: 2983
                                   21- 50x: 1787
                                   51-100x:  721
                                     >100x:  922
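As a consistency check, the occurrence classes in Section 3.1 should add up to the species total quoted in Section 3. The sketch below copies the counts from the table; the variable names are illustrative only.

```python
# Number of species per occurrence class, copied from Section 3.1.
species_by_occurrence = {
    "1x": 44301, "2x": 17044, "3x": 8565, "4x": 4547, "5x": 2681,
    "6x": 2006, "7x": 1349, "8x": 1150, "9x": 935, "10x": 816,
    "11-20x": 2983, "21-50x": 1787, "51-100x": 721, ">100x": 922,
}

TOTAL_SPECIES = 89807  # total quoted in Section 3

class_sum = sum(species_by_occurrence.values())
print(class_sum, class_sum == TOTAL_SPECIES)

# Share of species represented by a single sequence, in percent.
singleton_share = 100 * species_by_occurrence["1x"] / class_sum
print(f"{singleton_share:.1f}% of species occur only once")
```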
                    
                    
                    3.2  Table of the most represented species
                    
                    ------  ---------  --------------------------------------------
                    Number  Frequency  Species
                    ------  ---------  --------------------------------------------
                    1     126858  Human immunodeficiency virus 1
                    2      56039  Homo sapiens (Human)
                    3      47506  Oryza sativa (japonica cultivar-group)
                    4      39737  Arabidopsis thaliana (Mouse-ear cress)
                    5      38848  Mus musculus (Mouse)
                    6      24423  Drosophila melanogaster (Fruit fly)
                    7      22971  Hepatitis C virus
                    8      20188  Caenorhabditis elegans
                    9      15226  Anopheles gambiae str. PEST
                    10      13201  Caenorhabditis briggsae
                    11      12210  Brachydanio rerio (Zebrafish) (Danio rerio)
                    12      10974  Neurospora crassa
                    13      10718  Xenopus laevis (African clawed frog)
                    14       9678  Schistosoma japonicum (Blood fluke)
                    15       9528  Aspergillus nidulans FGSC A4
                    16       9240  Candida albicans SC5314
                    17       9048  Rattus norvegicus (Rat)
                    18       8142  Bradyrhizobium japonicum
                    19       7802  Plasmodium yoelii yoelii
                    20       7566  Streptomyces coelicolor
                    21       7397  Hepatitis B virus
                    22       7379  Streptomyces avermitilis
                    23       7183  Rhizobium loti (Mesorhizobium loti)
                    24       7081  uncultured bacterium
                    25       7067  Rhodopirellula baltica
                    26       7006  Escherichia coli
                    27       7006  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
                    28       6575  Cryptococcus neoformans (Filobasidiella neoformans)
                    29       6493  Pseudomonas aeruginosa
                    30       6463  Yarrowia lipolytica (Candida lipolytica)
                    31       6394  Giardia lamblia ATCC 50803
                    32       6275  Bacillus anthracis
                    33       6241  Debaryomyces hansenii (Yeast) (Torulaspora hansenii)
                    34       5803  Nocardia farcinica
                    35       5747  Burkholderia pseudomallei (Pseudomonas pseudomallei)
                    36       5696  Rhizobium meliloti (Sinorhizobium meliloti)
                    37       5565  Anabaena sp. (strain PCC 7120)
                    38       5560  Bacillus cereus (strain ATCC 10987)
                    39       5374  Gallus gallus (Chicken)
                    40       5228  Plasmodium falciparum (isolate 3D7)
                    41       5217  Yersinia pestis
                    42       5197  Trypanosoma brucei
                    43       5195  Kluyveromyces lactis (Yeast)
                    44       5136  Helicobacter pylori (Campylobacter pylori)
                    45       5121  Photobacterium profundum (Photobacterium sp. (strain SS9))
                    46       5106  Candida glabrata (Yeast) (Torulopsis glabrata)
                    47       4969  Pseudomonas syringae (pv. tomato)
                    48       4964  Bacillus cereus (strain ZK)
                    49       4947  Bordetella bronchiseptica (Alcaligenes bronchisepticus)
                    50       4894  Bacillus thuringiensis (subsp. konkukian)
                    51       4889  Escherichia coli O157:H7
                    52       4867  Bacillus licheniformis (strain DSM 13 / ATCC 14580)
                    53       4806  Bacillus cereus (strain ATCC 14579 / DSM 31)
                    54       4781  Pseudomonas putida (strain KT2440)
                    55       4772  Bacteroides fragilis
                    56       4728  Ralstonia solanacearum (Pseudomonas solanacearum)
                    57       4629  Xanthomonas oryzae (pv. oryzae)
                    58       4616  Rhodopseudomonas palustris
                    59       4607  Bacteroides thetaiotaomicron
                    60       4586  Leptospira interrogans
                    61       4530  Ashbya gossypii (Yeast) (Eremothecium gossypii)
                    62       4470  Vibrio vulnificus (strain YJ016)
                    63       4441  Azoarcus sp. (strain EbN1)
                    64       4407  Oryza sativa (Rice)
                    65       4404  Pongo pygmaeus (Orangutan)
                    66       4404  Burkholderia mallei (Pseudomonas mallei)
                    67       4396  Vibrio parahaemolyticus
                    68       4319  Mycobacterium tuberculosis
                    69       4247  Erwinia carotovora (subsp. atroseptica) (Pectobacterium atrosepticum)
                    70       4241  Salmonella enterica subsp. enterica serovar Choleraesuis str. SC-B67
                    71       4212  Mycobacterium paratuberculosis
                    72       4181  Silicibacter pomeroyi
                    73       4150  Gloeobacter violaceus
                    74       4146  Shewanella oneidensis
                    75       4115  Photorhabdus luminescens (subsp. laumondii)
                    76       4105  Haloarcula marismortui (Halobacterium marismortui)
                    77       4086  Chromobacterium violaceum
                    78       4060  Corynebacterium glutamicum (Brevibacterium flavum)
                    79       4058  Methanosarcina acetivorans
                    80       4034  Plasmodium falciparum
                    81       4031  Cryptosporidium parvum
                    82       4027  Salmonella typhi
                    83       4027  Vibrio vulnificus
                    84       3989  Vibrio cholerae
                    85       3978  Cryptosporidium hominis
                    86       3958  Salmonella paratyphi-a
                    87       3941  Bacillus clausii (strain KSM-K16)
                    88       3938  Yersinia pseudotuberculosis
                    89       3927  Shigella flexneri
                    90       3926  Escherichia coli O6
                    91       3911  Xanthomonas axonopodis (pv. citri)
                    92       3850  Bordetella parapertussis
                    93       3769  Vibrio fischeri (strain ATCC 700601 / ES114)
                    94       3768  Listeria monocytogenes
                    95       3755  Bos taurus (Bovine)
                    96       3753  Salmonella typhimurium
                    97       3720  Xanthomonas campestris (pv. campestris)
                    98       3588  Enterococcus faecalis (Streptococcus faecalis)
                    99       3562  Bacillus halodurans
                    100       3539  Streptococcus pneumoniae
                    101       3501  Torque teno virus
                    102       3477  Bdellovibrio bacteriovorus
                    103       3436  Leptospira interrogans (serogroup Icterohaemorrhagiae / serovar Copenhageni)
                    104       3407  Clostridium acetobutylicum
                    105       3407  Geobacillus kaustophilus
                    106       3325  Desulfovibrio vulgaris (strain Hildenborough / ATCC 29579 / NCIMB 8303)
                    107       3321  Caulobacter crescentus
                    108       3290  Chimpanzee immunodeficiency virus (SIV(cpz)) (CIV)
                    109       3225  Dictyostelium discoideum (Slime mold)
                    110       3214  Geobacter sulfurreducens
                    111       3213  Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
                    112       3192  Symbiobacterium thermophilum
                    113       3140  Acinetobacter sp. (strain ADP1)
                    114       3113  Desulfotalea psychrophila
                    115       3104  Streptococcus pyogenes
                    116       3086  Oceanobacillus iheyensis
                    117       3076  Brucella abortus biovar 1 str. 9-941
                    118       3050  Legionella pneumophila (strain Paris)
                    119       3033  Bordetella pertussis
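The figure quoted in Section 3 for the first twenty species can be reproduced from the first twenty rows of the table above and the entry count given in Section 1. This is a small sketch with illustrative variable names.

```python
# Frequencies of the 20 most represented species, copied from Section 3.2.
top20_counts = [
    126858, 56039, 47506, 39737, 38848, 24423, 22971, 20188, 15226, 13201,
    12210, 10974, 10718, 9678, 9528, 9240, 9048, 8142, 7802, 7566,
]

TOTAL_ENTRIES = 1_714_475  # UniProt/TrEMBL release 30.0, Section 1

top20_total = sum(top20_counts)
top20_share = round(100 * top20_total / TOTAL_ENTRIES, 1)

print(top20_total)  # 499903
print(top20_share)  # 29.2
```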
                    
                    
                    
                    3.3  Taxonomic distribution of the sequences
                    
                    Kingdom        sequences (% of the database)
                    Archaea           45239 (  3%)
                    Bacteria         633961 ( 37%)
                    Eukaryota        728869 ( 43%)
                    Viruses          304257 ( 18%)
                    Other              2149 ( <1%)
                    
                    Within Eukaryota:
                    
                    Category            sequences (% of Eukaryota) (% of the complete database)
                    Human                  56039 (  8%)           (  3%)
                    Other Mammalia         92879 ( 13%)           (  5%)
                    Other Vertebrata       92476 ( 13%)           (  5%)
                    Viridiplantae         177898 ( 24%)           ( 10%)
                    Fungi                  89518 ( 12%)           (  5%)
                    Insecta                85761 ( 12%)           (  5%)
                    Nematoda               35993 (  5%)           (  2%)
                    Other                  98305 ( 13%)           (  6%)
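The percentages in Section 3.3 can be recomputed from the raw counts. The sketch below does this for the kingdom table and checks that the Eukaryota categories add up to the Eukaryota total. How the release notes derive the "<1%" label is not documented here, so showing it for any value that rounds to zero is an assumption.

```python
TOTAL_ENTRIES = 1_714_475  # UniProt/TrEMBL release 30.0, Section 1

# Sequence counts copied from Section 3.3.
kingdoms = {
    "Archaea": 45239, "Bacteria": 633961, "Eukaryota": 728869,
    "Viruses": 304257, "Other": 2149,
}
eukaryota = {
    "Human": 56039, "Other Mammalia": 92879, "Other Vertebrata": 92476,
    "Viridiplantae": 177898, "Fungi": 89518, "Insecta": 85761,
    "Nematoda": 35993, "Other": 98305,
}


def percent_label(count, total):
    """Whole-number percentage; '<1%' when it rounds to zero (assumed rule)."""
    pct = round(100 * count / total)
    return "<1%" if pct == 0 else f"{pct}%"


labels = {name: percent_label(n, TOTAL_ENTRIES) for name, n in kingdoms.items()}
print(labels)

# Both breakdowns should be exhaustive.
print(sum(kingdoms.values()) == TOTAL_ENTRIES)
print(sum(eukaryota.values()) == kingdoms["Eukaryota"])
```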
                    
                    
                    
                    4.  SEQUENCE SIZE
                    
                    4.1  Repartition of the sequences by size (excluding fragments)
                    
                    From   To  Number             From   To   Number
                      1-  50   21155             1001-1100     9713
                     51- 100  102798             1101-1200     6992
                    101- 150  128388             1201-1300     5303
                    151- 200  117224             1301-1400     3404
                    201- 250  118574             1401-1500     2846
                    251- 300  109753             1501-1600     1953
                    301- 350  107083             1601-1700     1551
                    351- 400   86474             1701-1800     1314
                    401- 450   67185             1801-1900     1058
                    451- 500   58853             1901-2000      876
                    501- 550   46252             2001-2100      675
                    551- 600   32250             2101-2200      817
                    601- 650   24775             2201-2300      679
                    651- 700   19308             2301-2400      536
                    701- 750   16492             2401-2500      377
                    751- 800   13713             >2500         3339
                    801- 850   11539
                    851- 900   10187
                    901- 950    7584
                    951-1000    6052
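Because the table in Section 4.1 excludes fragments, its bins should sum to the total number of entries (Section 1) minus the number of fragments (Section 6). The following sketch checks this; the list and constant names are illustrative.

```python
# Counts per length bin, copied from Section 4.1.
bins_1_to_1000 = [
    21155, 102798, 128388, 117224, 118574, 109753, 107083, 86474, 67185, 58853,
    46252, 32250, 24775, 19308, 16492, 13713, 11539, 10187, 7584, 6052,
]
bins_over_1000 = [
    9713, 6992, 5303, 3404, 2846, 1953, 1551, 1314,
    1058, 876, 675, 817, 679, 536, 377, 3339,  # last value is the >2500 bin
]

TOTAL_ENTRIES = 1_714_475  # Section 1
FRAGMENTS = 567_403        # Section 6

non_fragment_total = sum(bins_1_to_1000) + sum(bins_over_1000)
print(non_fragment_total)
print(non_fragment_total == TOTAL_ENTRIES - FRAGMENTS)
```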
                    
                    
                    
                    
                    4.2  Longest and shortest sequences
                    
                    The shortest sequence is Q16047_HUMAN:     4 amino acids.
                    The longest sequence is  Q8WZ42_HUMAN: 34350 amino acids.
                    
                    
                    5.  STATISTICS FOR SOME LINE TYPES
                    
                    The following table summarizes the total number of some UniProt/TrEMBL 
                    lines, as well as the number of entries with at least one such line, and the
                    frequency of the lines.
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry
                    ---------------------------------  -------- ---------  ---------
                    
                    References (RL)                    2369863              1.38
                    Journal                         1496311   1260307    0.87
                    Submitted to EMBL/GenBank/DDBJ   860323    677541    0.50
                    Thesis                             4686      4634   <0.01
                    Book citation                      3792      3748   <0.01
                    Submitted to other databases        448       440   <0.01
                    Other                              4303      4302   <0.01
                    
                    Comments (CC)                       964235              0.56
                    SIMILARITY                       174303    171382    0.10
                    FUNCTION                         172046    170766    0.10
                    CATALYTIC ACTIVITY               170005    152037    0.10
                    SUBCELLULAR LOCATION             159288    159288    0.09
                    SUBUNIT                           92437     92437    0.05
                    CAUTION                           75319     75218    0.04
                    PATHWAY                           54200     53057    0.03
                    COFACTOR                          56256     56256    0.03
                    INTERACTION                        1161      1161   <0.01
                    MISCELLANEOUS                      3581      3564   <0.01
                    DOMAIN                             5309      4658   <0.01
                    ALLERGEN                            172       172   <0.01
                    
                    Features (FT)                      1014689              0.59
                    NON_TER                          960278    565239    0.56
                    CHAIN                             40998     24408    0.02
                    SIGNAL                            12809     12587    0.01
                    TRANSIT                             604       600   <0.01
                    
                    
                    Cross-references (DR)             12951295              7.55
                    GO                              3878298   1080665    2.26
                    InterPro                        2376394   1270589    1.39
                    EMBL                            2019928   1708137    1.18
                    Pfam                            1588390   1198364    0.93
                    PROSITE                          827908    540191    0.48
                    PRINTS                           392823    317615    0.23
                    SMART                            295055    227125    0.17
                    HSSP                             290497    290218    0.17
                    SMR                              248032    247914    0.14
                    ProDom                           204145    195938    0.12
                    PIR                              197872    162162    0.12
                    TIGRFAMs                         181672    168057    0.11
                    TIGR                              92497     86467    0.05
                    Ensembl                           73110     73097    0.04
                    PANTHER                           54609     54599    0.03
                    Gramene                           43206     43193    0.03
                    PIRSF                             32718     31848    0.02
                    FlyBase                           29155     22548    0.02
                    MGI                               24517     24515    0.01
                    WormPep                           19095     19014    0.01
                    WormBase                          19083     19014    0.01
                    ZFIN                               8839      8837    0.01
                    MEROPS                             8517      8250   <0.01
                    LegioList                          5711      5681   <0.01
                    IntAct                             5352      5352   <0.01
                    ListiList                          4818      4801   <0.01
                    AGD                                4483      4483   <0.01
                    PhotoList                          4236      4112   <0.01
                    Genew                              3264      3264   <0.01
                    PDB                                2759      1629   <0.01
                    TubercuList                        2494      2488   <0.01
                    RGD                                2474      2459   <0.01
                    GeneDB_Spombe                      2125      2119   <0.01
                    SagaList                           1812      1718   <0.01
                    SGD                                1374      1373   <0.01
                    TRANSFAC                           1030      1017   <0.01
                    Leproma                             985       984   <0.01
                    DictyBase                           980       980   <0.01
                    MypuList                            609       605   <0.01
                    REBASE                              125       120   <0.01
                    PHCI-2DPAGE                         108       108   <0.01
                    SWISS-2DPAGE                         87        87   <0.01
                    ANU-2DPAGE                           73        73   <0.01
                    Reactome                             30        30   <0.01
                    PMMA-2DPAGE                           3         3   <0.01
                    Siena-2DPAGE                          2         2   <0.01
                    COMPLUYEAST-2DPAGE                    1         1   <0.01
                    
                    Number of explicitly cross-referenced databases: 68
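The "Average per entry" column in Section 5 is the total number of lines divided by the number of entries in the release. The sketch below recomputes it for the four line-type groups. It assumes the divisor is the full entry count from Section 1; the function name is illustrative.

```python
TOTAL_ENTRIES = 1_714_475  # UniProt/TrEMBL release 30.0, Section 1

# Total line counts per line type, copied from Section 5.
line_totals = {
    "References (RL)": 2_369_863,
    "Comments (CC)": 964_235,
    "Features (FT)": 1_014_689,
    "Cross-references (DR)": 12_951_295,
}


def average_per_entry(total_lines, entries=TOTAL_ENTRIES):
    """Average number of lines per entry, rounded to two decimals."""
    return round(total_lines / entries, 2)


averages = {name: average_per_entry(n) for name, n in line_totals.items()}
for name, avg in averages.items():
    print(f"{name:<24}{avg:>6.2f}")
```

The results match the published column: 1.38, 0.56, 0.59 and 7.55.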
                    
                    6.  MISCELLANEOUS STATISTICS
                    
                    Total number of distinct authors cited in UniProt/TrEMBL: 210250
                    
                    Total number of entries encoded on a chloroplast: 41294
                    Total number of entries encoded on a mitochondrion: 99934
                    Total number of entries encoded on a cyanelle: 2
                    Total number of entries encoded on a plasmid: 33862
                    
                    Number of fragments: 567403
                    Number of additional sequences encoded on splice variants: 55
                    
                

Submissions and Updates

We welcome feedback from our users. We would especially appreciate being notified if sequences belonging to your field of expertise are missing from the database. We would also like to hear about annotations that need updating, for example when the function of a protein has been clarified or new information about post-translational modifications has become available.

Submit new sequence data, updates and corrections at http://www.uniprot.org/support/submissions.shtml

For all queries regarding submissions to UniProt, and to submit new protein sequence data, please contact:

UniProt Knowledgebase
The EMBL Outstation - The European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom

Telephone: (+44 1223) 494 462
Telefax: (+44 1223) 494 468
E-mail: datasubs@ebi.ac.uk


Download information

Bi-Weekly releases

The latest data of the UniProt Knowledgebase is available in various formats (flatfile, XML or FASTA) at http://www.uniprot.org/database/download.shtml. The data is further supplemented by two files containing the sequences of all additional splice isoforms annotated in UniProt/Swiss-Prot and UniProt/TrEMBL. These data sets are documented in the file ftp://ftp.expasy.org/databases/uniprot/current_release/knowledgebase/complete/README.varsplic
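
For readers who download the FASTA version, the following is a minimal Python sketch of how such a file could be read into (header, sequence) pairs. It assumes only the generic FASTA layout (a header line starting with ">" followed by one or more sequence lines). The function name read_fasta and the two sample records are hypothetical and are not taken from the UniProt distribution.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text.

    Assumes the generic FASTA layout: a line starting with '>' opens a
    record, and all following lines up to the next '>' hold its sequence.
    """
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical input, for illustration only (not real UniProt entries).
SAMPLE = """>example_1 hypothetical protein
MKT
AYI
>example_2 another hypothetical protein
GSH
"""

records = list(read_fasta(io.StringIO(SAMPLE)))
for header, sequence in records:
    print(header, len(sequence))
```

To read a downloaded file instead of the in-memory sample, pass an open text file handle to read_fasta in place of the io.StringIO object.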

Major releases

For users who wish to download the UniProt Knowledgebase only occasionally, we distribute the latest major release (updated 4 times per year) in flatfile format. Previous UniProt/Swiss-Prot and UniProt/TrEMBL major releases are archived under ftp://ftp.uniprot.org/databases/uniprot/previous_major_releases. The UniProt Knowledgebase major release is also available on CD-ROM from the EBI.


Contact

EMBL Outstation
European Bioinformatics Institute (EBI)
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom

Telephone: (+44 1223) 494 444
Fax: (+44 1223) 494 468
Electronic mail address: datalib@ebi.ac.uk / swissprot@ebi.ac.uk
WWW server: http://www.ebi.ac.uk/


SIB Swiss Institute of Bioinformatics
Centre Medical Universitaire
1, rue Michel Servet
1211 Geneva 4
Switzerland

Telephone: (+41 22) 702 50 50
Fax: (+41 22) 702 58 58
Electronic mail address: Swiss-Prot@expasy.org
WWW server: http://www.expasy.org/


Protein Information Resource (PIR)
Georgetown University Medical Center
3900 Reservoir Road, NW
Box 571455
Washington, DC 20057-1455
United States of America

Telephone: (+1 202) 687 1039
Fax: (+1 202) 687 0057
Electronic mail address: pirmail@georgetown.edu
WWW server: http://pir.georgetown.edu

Citation

If you want to cite UniProt in a publication please use the following reference:

Bairoch A., Apweiler R., Wu C.H., Barker W.C., Boeckmann B., Ferro S., Gasteiger E., Huang H., Lopez R., Magrane M., Martin M.J., Natale D.A., O'Donovan C., Redaschi N., Yeh L.S., The Universal Protein Resource (UniProt), Nucleic Acids Res. 33: D154-D159 (2005).