UniProtKB/Swiss-Prot protein knowledgebase release 2011_04 statistics
                    
                    
                    1.  INTRODUCTION
                    
                    Release 2011_04 of 05-Apr-11 of UniProtKB/Swiss-Prot contains 526969 sequence entries,
                    comprising 186402391 amino acids abstracted from 196878 references. 
                    
                    1009 sequences have been added since release 2011_03, the sequence data of
                    119 existing entries has been updated and the annotations of
                    131672 entries have been revised.
                    
                    Number of fragments: 8836
                    Number of additional sequences produced by alternative splicing, initiation or promoter usage, or ribosomal frameshifting: 30094
                    
                    
                    Protein existence (PE):           entries     %
                    
                    1: Evidence at protein level        72615   13.8%
                    2: Evidence at transcript level     68623   13.0%
                    3: Inferred from homology          369536   70.1%
                    4: Predicted                        14342    2.7%
                    5: Uncertain                         1853    0.4%
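The percentages in the table above follow directly from the entry counts. The short Python sketch below recomputes them; the variable and function names are illustrative only and are not part of any UniProt tooling, and rounding to one decimal place is an assumption made to match the figures shown.

```python
# Entry counts per protein existence (PE) level, copied from the table above.
pe_counts = {
    "1: Evidence at protein level": 72615,
    "2: Evidence at transcript level": 68623,
    "3: Inferred from homology": 369536,
    "4: Predicted": 14342,
    "5: Uncertain": 1853,
}

# The PE levels partition the database, so the counts add up to the
# total number of entries in the release.
total_entries = sum(pe_counts.values())


def pe_percentages(counts):
    """Return each count as a percentage of the total, to one decimal place."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


percentages = pe_percentages(pe_counts)
for label, pct in percentages.items():
    print(f"{label:<34}{pe_counts[label]:>8}{pct:>7.1f}%")
```

The five counts sum to the 526969 entries reported in the introduction, which is a quick way to confirm that every entry carries exactly one PE level.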
                    
                    The growth of the database is summarized in a figure that is not reproduced in this text version.
                    
                    
                    
                    
                    2.  TAXONOMIC ORIGIN
                    
                    Total number of species represented in this release of UniProtKB/Swiss-Prot: 12439
                    
                    The first twenty species represent 109757 sequences:  20.8 % of the total
                    number of entries.
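The share quoted above can be reproduced from the first twenty rows of the table of the most represented species (section 2.2). The following Python sketch is only an illustration; the names `top20_counts` and `share_percent` are made up for this example.

```python
# Entry counts of the twenty most represented species (section 2.2, rows 1-20).
top20_counts = [
    20232, 16354, 10231, 7600, 6596,  # human, mouse, A. thaliana, rat, S. cerevisiae
    5833, 4976, 4430, 4244, 4161,     # bovine, S. pombe, E. coli K12, B. subtilis, D. discoideum
    3331, 3299, 3115, 2718, 2657,     # C. elegans, X. laevis, D. melanogaster, zebrafish, rice
    2214, 2201, 1997, 1785, 1783,     # P. abelii, chicken, E. coli O157:H7, M. tuberculosis, M. jannaschii
]

total_entries = 526969  # total number of entries in release 2011_04

top20_total = sum(top20_counts)
share_percent = 100 * top20_total / total_entries

print(f"{top20_total} sequences: {share_percent:.1f} % of the total")
```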
                    
                    
                    2.1 Table of the frequency of occurrence of species
                    
                    Species represented   1x: 5307
                                          2x: 1777
                                          3x:  941
                                          4x:  609
                                          5x:  447
                                          6x:  355
                                          7x:  257
                                          8x:  215
                                          9x:  195
                                         10x:  104
                                     11- 20x:  623
                                     21- 50x:  390
                                     51-100x:  195
                                       >100x: 1024
                    
                    
                    2.2  Table of the most represented species
                    
                    ------  ---------  --------------------------------------------
                    Number  Frequency  Species
                    ------  ---------  --------------------------------------------
                    1      20232  Homo sapiens (Human)
                    2      16354  Mus musculus (Mouse)
                    3      10231  Arabidopsis thaliana (Mouse-ear cress)
                    4       7600  Rattus norvegicus (Rat)
                    5       6596  Saccharomyces cerevisiae (Baker's yeast)
                    6       5833  Bos taurus (Bovine)
                    7       4976  Schizosaccharomyces pombe (Fission yeast)
                    8       4430  Escherichia coli (strain K12)
                    9       4244  Bacillus subtilis
                    10       4161  Dictyostelium discoideum (Slime mold)
                    11       3331  Caenorhabditis elegans
                    12       3299  Xenopus laevis (African clawed frog)
                    13       3115  Drosophila melanogaster (Fruit fly)
                    14       2718  Danio rerio (Zebrafish) (Brachydanio rerio)
                    15       2657  Oryza sativa subsp. japonica (Rice)
                    16       2214  Pongo abelii (Sumatran orangutan)
                    17       2201  Gallus gallus (Chicken)
                    18       1997  Escherichia coli O157:H7
                    19       1785  Mycobacterium tuberculosis
                    20       1783  Methanocaldococcus jannaschii (Methanococcus jannaschii)
                    21       1778  Salmonella typhimurium
                    22       1707  Haemophilus influenzae (strain ATCC 51907 / DSM 11121 / KW20 / Rd)
                    23       1674  Shigella flexneri
                    24       1672  Escherichia coli O6
                    25       1592  Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
                    26       1385  Sus scrofa (Pig)
                    27       1342  Salmonella typhi
                    28       1284  Pseudomonas aeruginosa
                    29       1240  Mycobacterium bovis
                    30       1166  Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey)
                    31       1024  Synechocystis sp. (strain ATCC 27184 / PCC 6803 / N-1)
                    32       1000  Yersinia pestis
                    33        994  Archaeoglobus fulgidus
                    34        946  Vibrio cholerae
                    35        929  Salmonella paratyphi A
                    36        924  Staphylococcus aureus (strain N315)
                    37        923  Staphylococcus aureus (strain Mu50 / ATCC 700699)
                    38        913  Rhizobium meliloti (Sinorhizobium meliloti)
                    39        909  Acanthamoeba polyphaga mimivirus (APMV)
                    40        897  Staphylococcus aureus (strain COL)
                    41        895  Staphylococcus aureus (strain MW2)
                    42        889  Staphylococcus aureus (strain MSSA476)
                    43        886  Staphylococcus aureus (strain MRSA252)
                    44        885  Escherichia coli O6:K15:H31 (strain 536 / UPEC)
                    45        884  Oryctolagus cuniculus (Rabbit)
                    46        881  Salmonella choleraesuis
                    47        876  Shigella sonnei (strain Ss046)
                    48        875  Ashbya gossypii (strain ATCC 10895 / CBS 109.51 / FGSC 9923 / NRRL Y-1056)  
                    49        864  Yersinia pseudotuberculosis
                    50        850  Kluyveromyces lactis (Yeast) (Candida sphaerica)
                    51        841  Escherichia coli O9:H4 (strain HS)
                    52        834  Escherichia coli O139:H28 (strain E24377A / ETEC)
                    53        832  Candida albicans (Yeast)
                    54        826  Shigella boydii serotype 4 (strain Sb227)
                    55        823  Escherichia coli (strain UTI89 / UPEC)
                    56        819  Escherichia coli (strain ATCC 8739 / DSM 1576 / Crooks)
                    57        808  Shigella dysenteriae serotype 1 (strain Sd197)
                    58        807  Candida glabrata (Yeast) (Torulopsis glabrata)
                    59        795  Neurospora crassa
                    60        794  Vibrio parahaemolyticus
                    61        790  Escherichia coli (strain SMS-3-5 / SECEC)
                    62        779  Erwinia carotovora subsp. atroseptica (Pectobacterium atrosepticum)
                    63        778  Pasteurella multocida
                    64        776  Canis familiaris (Dog) (Canis lupus familiaris)
                    65        773  Aquifex aeolicus
                    66        770  Escherichia coli (strain K12 / DH10B)
                    67        764  Escherichia coli O127:H6 (strain E2348/69 / EPEC)
                    68        764  Escherichia coli (strain K12 / MC4100 / BW2952)
                    69        762  Escherichia coli O17:K52:H18 (strain UMN026 / ExPEC)
                    70        762  Escherichia coli (strain 55989 / EAEC)
                    71        761  Escherichia coli O8 (strain IAI1)
                    72        757  Shigella flexneri serotype 5b (strain 8401)
                    73        757  Staphylococcus epidermidis (strain ATCC 35984 / RP62A)
                    74        756  Escherichia coli (strain SE11)
                    75        756  Staphylococcus epidermidis (strain ATCC 12228)
                    76        756  Escherichia coli O45:K1 (strain S88 / ExPEC)
                    77        753  Escherichia coli O7:K1 (strain IAI39 / ExPEC)
                    78        751  Streptomyces coelicolor
                    79        746  Escherichia coli O157:H7 (strain EC4115 / EHEC)
                    80        741  Photorhabdus luminescens subsp. laumondii
                    81        740  Emericella nidulans (Aspergillus nidulans)
                    82        732  Yersinia enterocolitica serotype O:8 / biotype 1B (strain 8081)
                    83        731  Escherichia coli O81 (strain ED1a)
                    84        731  Bacillus halodurans
                    85        731  Vibrio vulnificus
                    86        724  Bacillus anthracis
                    87        720  Salmonella enteritidis PT4 (strain P125109)
                    88        716  Vibrio vulnificus (strain YJ016)
                    89        715  Salmonella paratyphi B (strain ATCC BAA-1250 / SPB7)
                    90        713  Yersinia pestis bv. Antiqua (strain Nepal516)
                    91        713  Staphylococcus aureus (strain NCTC 8325)
                    92        713  Salmonella paratyphi A (strain AKU_12601)
                    93        713  Yersinia pseudotuberculosis serotype O:1b (strain IP 31758)
                    94        711  Salmonella agona (strain SL483)
                    95        711  Escherichia coli O1:K1 / APEC
                    96        711  Salmonella newport (strain SL254)
                    97        710  Salmonella heidelberg (strain SL476)
                    98        709  Yersinia pestis bv. Antiqua (strain Antiqua)
                    99        709  Salmonella schwarzengrund (strain CVM19633)
                    100        706  Enterobacter sp. (strain 638)
                    101        705  Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)
                    102        700  Salmonella dublin (strain CT_02021853)
                    103        697  Shigella boydii serotype 18 (strain CDC 3083-94 / BS512)
                    104        691  Klebsiella pneumoniae (strain 342)
                    105        687  Escherichia fergusonii (strain ATCC 35469 / DSM 13698 / CDC 0568-73)
                    106        687  Mycoplasma pneumoniae
                    107        686  Pan troglodytes (Chimpanzee)
                    108        685  Nostoc sp. (strain PCC 7120 / UTEX 2576)
                    109        684  Pseudomonas syringae pv. tomato
                    110        682  Salmonella gallinarum (strain 287/91 / NCTC 13346)
                    111        672  Pseudomonas putida (strain KT2440)
                    112        670  Citrobacter koseri (strain ATCC BAA-895 / CDC 4225-83 / SGSC4696)
                    113        668  Mycobacterium leprae
                    114        667  Zea mays (Maize)
                    115        666  Staphylococcus aureus (strain USA300)
                    116        666  Yersinia pestis (strain Pestoides F)
                    117        662  Serratia proteamaculans (strain 568)
                    118        658  Rhizobium sp. (strain NGR234)
                    119        650  Bradyrhizobium japonicum
                    120        642  Staphylococcus aureus (strain bovine RF122 / ET3-1)
                    121        640  Escherichia coli
                    122        638  Bacillus cereus (strain ATCC 14579 / DSM 31)
                    123        637  Yersinia pseudotuberculosis serotype O:3 (strain YPIII)
                    124        635  Salmonella arizonae (strain ATCC BAA-731 / CDC346-86 / RSK2980)
                    125        633  Yersinia pseudotuberculosis serotype IB (strain PB1/+)
                    126        630  Debaryomyces hansenii (Yeast) (Torulaspora hansenii)
                    127        626  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
                    128        621  Shewanella oneidensis
                    129        618  Yarrowia lipolytica (Candida lipolytica)
                    130        615  Treponema pallidum
                    131        614  Ralstonia solanacearum (Pseudomonas solanacearum)
                    132        613  Enterobacter sakazakii (strain ATCC BAA-894)
                    133        611  Staphylococcus haemolyticus (strain JCSC1435)
                    134        605  Methanobacterium thermoautotrophicum
                    135        605  Rhizobium loti (Mesorhizobium loti)
                    136        602  Staphylococcus saprophyticus subsp. saprophyticus 
                    137        600  Yersinia pestis bv. Antiqua (strain Angola)
                    138        600  Salmonella paratyphi C (strain RKS4594)
                    139        598  Photobacterium profundum (Photobacterium sp. (strain SS9))
                    140        598  Neosartorya fumigata (strain ATCC MYA-4609 / Af293 / CBS 101355 / FGSC A1100) 
                    141        597  Listeria monocytogenes
                    142        590  Bacillus cereus (strain ATCC 10987)
                    143        590  Xanthomonas campestris pv. campestris
                    144        588  Listeria innocua
                    145        585  Rickettsia prowazekii
                    146        585  Pectobacterium carotovorum subsp. carotovorum (strain PC1)
                    147        584  Helicobacter pylori (Campylobacter pylori)
                    148        581  Lactococcus lactis subsp. lactis (Streptococcus lactis)
                    149        579  Neisseria meningitidis serogroup B
                    150        576  Brucella suis
                    151        572  Brucella melitensis
                    152        572  Buchnera aphidicola subsp. Acyrthosiphon pisum 
                    153        569  Bacillus thuringiensis subsp. konkukian
                    154        565  Helicobacter pylori J99 (Campylobacter pylori J99)
                    155        565  Pseudomonas syringae pv. syringae (strain B728a)
                    156        562  Buchnera aphidicola subsp. Schizaphis graminum
                    157        560  Bacillus licheniformis (strain DSM 13 / ATCC 14580)
                    158        560  Bacillus cereus (strain ZK / E33L)
                    159        557  Pseudomonas aeruginosa (strain UCBPP-PA14)
                    160        556  Neisseria meningitidis serogroup A
                    161        556  Clostridium acetobutylicum
                    162        556  Xanthomonas axonopodis pv. citri (Citrus canker)
                    163        554  Vibrio fischeri (strain ATCC 700601 / ES114)
                    164        553  Oryza sativa subsp. indica (Rice)
                    165        552  Pseudomonas fluorescens (strain Pf0-1)
                    166        551  Caenorhabditis briggsae
                    167        549  Oceanobacillus iheyensis
                    168        547  Caulobacter crescentus (Caulobacter vibrioides)
                    169        545  Pseudomonas fluorescens (strain Pf-5 / ATCC BAA-477)
                    170        543  Pseudomonas syringae pv. phaseolicola (strain 1448A / Race 6)
                    171        529  Listeria monocytogenes serotype 4b (strain F2365)
                    172        526  Sodalis glossinidius (strain morsitans)
                    173        526  Erwinia tasmaniensis (strain DSM 17950 / Et1/99)
                    174        522  Xylella fastidiosa
                    175        521  Bordetella bronchiseptica (Alcaligenes bronchisepticus)
                    176        519  Streptococcus pneumoniae
                    177        513  Chromobacterium violaceum
                    178        513  Thermotoga maritima
                    179        512  Xylella fastidiosa (strain Temecula1 / ATCC 700964)
                    180        510  Vibrio cholerae serotype O1 (strain ATCC 39541 / Ogawa 395 / O395)
                    181        507  Bordetella parapertussis
                    182        507  Buchnera aphidicola subsp. Baizongia pistaciae (strain Bp)
                    183        507  Pseudomonas aeruginosa (strain PA7)
                    184        507  Haemophilus ducreyi
                    185        506  Bordetella pertussis
                    186        505  Staphylococcus aureus (strain Newman)
                    187        505  Geobacillus kaustophilus
                    188        500  Pseudomonas entomophila (strain L48)
                    189        499  Deinococcus radiodurans
                    190        498  Brucella abortus
                    191        497  Rickettsia conorii
                    192        496  Bacillus clausii (strain KSM-K16)
                    193        493  Corynebacterium glutamicum (Brevibacterium flavum)
                    194        493  Streptomyces avermitilis
                    195        492  Haemophilus influenzae (strain 86-028NP)
                    196        491  Xanthomonas campestris pv. campestris (strain 8004)
                    197        490  Vibrio harveyi (strain ATCC BAA-1116 / BB120)
                    198        490  Clostridium perfringens
                    199        488  Bacillus amyloliquefaciens (strain FZB42)
                    200        487  Burkholderia pseudomallei (Pseudomonas pseudomallei)
                    201        487  Shewanella sp. (strain MR-7)
                    202        484  Mannheimia succiniciproducens (strain MBEL55E)
                    203        484  Pseudomonas aeruginosa (strain LESB58)
                    204        484  Staphylococcus aureus (strain Mu3 / ATCC 700698)
                    205        484  Shewanella sp. (strain MR-4)
                    206        483  Proteus mirabilis (strain HI4320)
                    207        483  Methanosarcina acetivorans
                    208        483  Mycoplasma genitalium
                    209        475  Acinetobacter sp. (strain ADP1)
                    210        475  Synechococcus elongatus (strain PCC 7942) (Anacystis nidulans R2)
                    211        474  Burkholderia sp. (strain 383) (Burkholderia cepacia 
                    212        474  Thermosynechococcus elongatus (strain BP-1)
                    213        473  Pseudomonas putida (strain F1 / ATCC 700007)
                    214        472  Brucella abortus (strain 2308)
                    215        470  Enterococcus faecalis (Streptococcus faecalis)
                    216        468  Pyrococcus horikoshii 
                    217        466  Rhodopseudomonas palustris
                    218        466  Xanthomonas campestris pv. vesicatoria (strain 85-10)
                    219        465  Pseudomonas putida (strain GB-1)
                    220        464  Shewanella frigidimarina (strain NCIMB 400)
                    221        462  Lactobacillus plantarum
                    222        462  Pyrococcus abyssi (strain GE5 / Orsay)
                    223        462  Anabaena variabilis (strain ATCC 29413 / PCC 7937)
                    224        462  Methanosarcina mazei (Methanosarcina frisia)
                    225        462  Shewanella sp. (strain ANA-3)
                    226        461  Burkholderia mallei (Pseudomonas mallei)
                    227        461  Cupriavidus necator (strain ATCC 17699 / H16 / DSM 428 / Stanier 337) 
                    228        459  Aeromonas hydrophila subsp. hydrophila (strain ATCC 7966 / NCIB 9240)
                    229        458  Cupriavidus pinatubonensis (strain JMP134 / LMG 1197) (Alcaligenes eutrophus) 
                    230        457  Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
                    231        455  Staphylococcus aureus (strain JH1)
                    232        455  Halobacterium salinarium (Halobacterium halobium)
                    233        454  Xanthomonas oryzae pv. oryzae (strain MAFF 311018)
                    234        453  Rickettsia felis (Rickettsia azadi)
                    235        452  Shewanella baltica (strain OS185)
                    236        452  Pseudomonas putida (strain W619)
                    237        451  Ovis aries (Sheep)
                    238        449  Staphylococcus aureus (strain JH9)
                    239        449  Methylococcus capsulatus
                    240        449  Streptococcus mutans
                    241        449  Aeromonas salmonicida (strain A449)
                    242        448  Thermoanaerobacter tengcongensis
                    243        448  Rhodobacter sphaeroides (strain ATCC 17023 / 2.4.1 / NCIB 8253 / DSM 158)
                    244        448  Aspergillus oryzae (strain ATCC 42149 / RIB 40)
                    245        447  Vibrio fischeri (strain MJ11)
                    246        446  Mycobacterium paratuberculosis
                    247        444  Hahella chejuensis (strain KCTC 2396)
                    248        444  Pseudomonas mendocina (strain ymp)
                    249        444  Dechloromonas aromatica (strain RCB)
                    250        443  Pyrococcus furiosus (strain ATCC 43587 / DSM 3638 / JCM 8422 / Vc1)
                    
                    
                    
                    2.3  Taxonomic distribution of the sequences
                    
                    
                    
                    Kingdom        sequences (% of the database)
                    Archaea           18483 (  4%)
                    Bacteria         325615 ( 62%)
                    Eukaryota        167202 ( 32%)
                    Viruses           15669 (  3%)
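The kingdom percentages above add up to 101% rather than 100%. This is what one would expect if each percentage is rounded to the nearest integer independently, which is the assumption made in the Python sketch below (the names used are illustrative).

```python
# Sequence counts per kingdom, copied from the table above.
kingdom_counts = {
    "Archaea": 18483,
    "Bacteria": 325615,
    "Eukaryota": 167202,
    "Viruses": 15669,
}

total_entries = sum(kingdom_counts.values())

# Exact shares, and the same shares rounded independently to whole percents.
exact = {k: 100 * n / total_entries for k, n in kingdom_counts.items()}
rounded = {k: round(v) for k, v in exact.items()}

for kingdom in kingdom_counts:
    print(f"{kingdom:<12}{kingdom_counts[kingdom]:>8} ({rounded[kingdom]:>3}%)  exact: {exact[kingdom]:.2f}%")

# Independent rounding does not preserve the total of 100%.
print("sum of rounded percentages:", sum(rounded.values()))
```

The exact shares still sum to 100%; only the rounded display values drift.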
                    
                    
                    Within Eukaryota:
                    
                    
                    
                    Category            sequences (% of Eukaryota) (% of the complete database)
                    Human                  20233 ( 12%)           (  4%)
                    Other Mammalia         45079 ( 27%)           (  9%)
                    Other Vertebrata       16583 ( 10%)           (  3%)
                    Viridiplantae          30821 ( 18%)           (  6%)
                    Fungi                  28129 ( 17%)           (  5%)
                    Insecta                 8226 (  5%)           (  2%)
                    Nematoda                4185 (  3%)           (  1%)
                    Other                  13946 (  8%)           (  3%)
                    
                    
                    
                    3.  SEQUENCE SIZE
                    
                    Distribution of the sequences by size (excluding fragments)
                    
                    From   To  Number             From   To   Number
                       1-  50    8642             1001-1100     3599
                      51- 100   40607             1101-1200     2492
                     101- 150   56640             1201-1300     1969
                     151- 200   56620             1301-1400     1817
                     201- 250   55433             1401-1500     1443
                     251- 300   48652             1501-1600      692
                     301- 350   49075             1601-1700      531
                     351- 400   42308             1701-1800      436
                     401- 450   34677             1801-1900      407
                     451- 500   27765             1901-2000      331
                     501- 550   19682             2001-2100      204
                     551- 600   14090             2101-2200      274
                     601- 650   11854             2201-2300      282
                     651- 700    8542             2301-2400      169
                     701- 750    7055             2401-2500      131
                     751- 800    5027             >2500         1046
                     801- 850    4386
                     851- 900    4911
                     901- 950    3723
                     951-1000    2621
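Because the size table excludes fragments, its counts should add up to the total number of entries minus the number of fragments given in the introduction (526969 - 8836). The Python sketch below performs that consistency check; the tuple layout and variable names are choices made for this example, with `None` standing in for the open upper bound of the ">2500" bin.

```python
# (lower bound, upper bound, number of sequences) for each size bin.
# None marks the open-ended ">2500" bin.
size_bins = [
    (1, 50, 8642), (51, 100, 40607), (101, 150, 56640), (151, 200, 56620),
    (201, 250, 55433), (251, 300, 48652), (301, 350, 49075), (351, 400, 42308),
    (401, 450, 34677), (451, 500, 27765), (501, 550, 19682), (551, 600, 14090),
    (601, 650, 11854), (651, 700, 8542), (701, 750, 7055), (751, 800, 5027),
    (801, 850, 4386), (851, 900, 4911), (901, 950, 3723), (951, 1000, 2621),
    (1001, 1100, 3599), (1101, 1200, 2492), (1201, 1300, 1969), (1301, 1400, 1817),
    (1401, 1500, 1443), (1501, 1600, 692), (1601, 1700, 531), (1701, 1800, 436),
    (1801, 1900, 407), (1901, 2000, 331), (2001, 2100, 204), (2101, 2200, 274),
    (2201, 2300, 282), (2301, 2400, 169), (2401, 2500, 131), (2501, None, 1046),
]

total_entries = 526969  # entries in release 2011_04
fragments = 8836        # entries flagged as fragments

non_fragment_entries = total_entries - fragments
binned_total = sum(count for _, _, count in size_bins)

# The bin holding the most sequences.
largest_bin = max(size_bins, key=lambda b: b[2])

print("non-fragment entries:", non_fragment_entries)
print("sum over size bins:  ", binned_total)
print("most populated bin:  ", largest_bin)
```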
                    
                    
                    
                    
                    The average sequence length in UniProtKB/Swiss-Prot is 353 amino acids.
                    
                    The shortest sequence is   GWA_SEPOF (P83570):     2 amino acids.
                    The longest sequence is  TITIN_MOUSE (A2ASS6): 35213 amino acids.
                    
                    
                    4.  JOURNAL CITATIONS
                    
                    Note: the following citation statistics reflect the number of distinct
                    journal citations.
                    
                    Total number of journals cited in this release of UniProtKB/Swiss-Prot: 2124
                    
                    
                    4.1 Table of the frequency of journal citations
                    
                    Journals cited   1x:  689
                                     2x:  278
                                     3x:  151
                                     4x:  102
                                     5x:   94
                                     6x:   69
                                     7x:   33
                                     8x:   44
                                     9x:   31
                                    10x:   23
                                11- 20x:  171
                                21- 50x:  176
                                51-100x:   99
                                  >100x:  164
                    
                    
                    4.2  List of the most cited journals in UniProtKB/Swiss-Prot
                    
                    Nb    Citations   Journal name
                    --    ---------   -------------------------------------------------------------
                    1        18630   Journal of Biological Chemistry
                    2         8626   Proceedings of the National Academy of Sciences of the U.S.A.
                    3         5168   Journal of Bacteriology
                    4         4662   Biochemical and Biophysical Research Communications
                    5         4535   Gene
                    6         4353   Nucleic Acids Research
                    7         4059   FEBS Letters
                    8         4027   Biochemistry
                    9         3866   The EMBO Journal
                    10         3536   Molecular and Cellular Biology
                    11         3371   Nature
                    12         3202   Journal of Molecular Biology
                    13         3137   European Journal of Biochemistry
                    14         3024   Biochimica et Biophysica Acta
                    15         2787   Cell
                    16         2484   Genomics
                    17         2233   Biochemical Journal
                    18         2205   Science
                    19         2147   Journal of Virology
                    20         1822   Molecular Microbiology
                    21         1633   Journal of Cell Biology
                    22         1533   Plant Molecular Biology
                    23         1469   Plant Physiology
                    24         1434   Genes and Development
                    25         1400   Virology
                    26         1363   Human Molecular Genetics
                    27         1356   Nature Genetics
                    28         1342   The American Journal of Human Genetics
                    29         1310   Molecular and General Genetics
                    30         1234   Oncogene
                    31         1210   Development
                    32         1185   Journal of Biochemistry
                    33         1151   Human Mutation
                    34         1083   Molecular Biology of the Cell
                    35         1038   Journal of Immunology
                    36         1030   The Plant Cell
                    37         1023   Genetics
                    38          940   Journal of General Virology
                    39          931   Structure
                    40          912   Molecular Cell
                    41          899   Infection and Immunity
                    42          865   The Plant Journal
                    43          845   Archives of Biochemistry and Biophysics
                    44          824   Blood
                    45          782   Journal of Cell Science
                    46          778   Microbiology
                    47          764   Yeast
                    48          754   Developmental Biology
                    49          694   Cancer Research
                    50          692   Current Biology
                    51          679   FEMS Microbiology Letters
                    52          603   Nature Structural Biology
                    53          601   Mechanisms of Development
                    54          597   Human Genetics
                    55          582   Acta Crystallographica, Section D
                    56          577   Protein Science
                    57          561   Applied and Environmental Microbiology
                    58          555   Journal of Neuroscience
                    59          535   Toxicon
                    60          529   Current Genetics
                    61          524   Neuron
                    62          519   Journal of Clinical Investigation
                    63          474   American Journal of Physiology
                    64          473   Mammalian Genome
                    65          456   The Journal of Experimental Medicine
                    66          451   Immunogenetics
                    67          449   Molecular Endocrinology
                    68          424   Molecular and Biochemical Parasitology
                    69          418   Journal of Neurochemistry
                    70          412   The Journal of Clinical Endocrinology and Metabolism
                    71          407   Endocrinology
                    72          396   Proteins
                    73          382   Journal of Molecular Evolution
                    74          380   Bioscience, Biotechnology, and Biochemistry
                    75          367   Journal of Medical Genetics
                    76          367   DNA and Cell Biology
                    77          362   Molecular Biology and Evolution
                    78          358   DNA Sequence
                    79          355   Plant and Cell Physiology
                    80          335   Nature Cell Biology
                    81          321   Tissue Antigens
                    82          316   Experimental Cell Research
                    83          316   Peptides
                    84          316   Brain Research. Molecular Brain Research
                    85          301   Comparative Biochemistry and Physiology
                    86          291   Biological Chemistry Hoppe-Seyler
                    87          290   Antimicrobial Agents and Chemotherapy
                    88          285   Journal of Investigative Dermatology
                    89          276   Cytogenetics and Cell Genetics
                    90          273   Molecular Pharmacology
                    91          268   Developmental Cell
                    92          267   Biology of Reproduction
                    93          259   RNA
                    94          253   Genome Research
                    95          253   Neurology
                    96          252   Virus Research
                    97          251   Journal of General Microbiology
                    98          247   Developmental Dynamics
                    99          233   Molecular Plant-Microbe Interactions
                    100          233   Planta
                    101          218   Hoppe-Seyler's Zeitschrift fur Physiologische Chemie
                    102          215   Nature Structural and Molecular Biology
                    103          212   Genes to Cells
                    104          210   Annals of Neurology
                    105          208   DNA Research
                    106          208   European Journal of Immunology
                    107          207   Biochimie
                    108          204   Immunity
                    109          204   The FEBS Journal
                    110          202   Eukaryotic Cell
                    111          200   The New England Journal of Medicine
                    112          200   European Journal of Human Genetics
                    113          192   EMBO Reports
                    114          189   Journal of Human Genetics
                    115          181   PLoS ONE
                    116          179   The FASEB Journal
                    117          178   Investigative Ophthalmology and Visual Science
                    118          177   Molecular and Cellular Endocrinology
                    119          172   Archives of Virology
                    120          169   Archives of Microbiology
                    121          165   Molecular Immunology
                    122          165   Insect Biochemistry and Molecular Biology
                    123          165   American Journal of Medical Genetics
                    124          165   Molecular Phylogenetics and Evolution
                    125          159   DNA
                    126          156   American Journal of Medical Genetics. Part A
                    127          155   Molecular Reproduction and Development
                    128          155   Diabetes
                    129          155   Glycobiology
                    130          153   Hemoglobin
                    131          152   Bioorganicheskaia Khimiia
                    132          151   Clinical Genetics
                    133          151   Journal of the American Chemical Society
                    134          148   Journal of Cellular Biochemistry
                    135          147   BMC Genomics
                    136          146   International Journal of Cancer
                    137          141   Molecular Genetics and Metabolism
                    138          140   Molecular and Cellular Neuroscience
                    139          138   Nature Immunology
                    140          138   Animal Genetics
                    141          137   General and Comparative Endocrinology
                    142          135   Biological Chemistry
                    143          134   Molecular Genetics and Genomics
                    144          131   British Journal of Haematology
                    145          129   Journal of Lipid Research
                    146          127   Proteomics
                    147          124   Journal of Medicinal Chemistry
                    148          124   Circulation Research
                    149          122   Protein Expression and Purification
                    150          122   Agricultural and Biological Chemistry
                    
                    
                    5.  STATISTICS FOR SOME LINE TYPES
                    
                    The following tables summarize, for selected UniProtKB/Swiss-Prot line types,
                    the total number of lines, the number of entries with at least one such line,
                    and the average number of such lines per entry (computed over all entries in
                    the release).
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank
                    ------------------------------------  -------- ---------  ---------  ----
                    
                    References (RL)                       960597                 1.82                                         
                    Journal                            766116     399615      1.45       1                                 
                    Submitted to EMBL/GenBank/DDBJ     185854     171105      0.35       2                                 
                    Submitted to other databases         6542       6103      0.01       3                                 
                    Book citation                         646        632     <0.01       4                                 
                    Plant Gene Register                   566        554     <0.01       5                                 
                    Thesis                                402        399     <0.01       6                                 
                    Unpublished observations              293        289     <0.01       7                                 
                    Patent                                172        170     <0.01       8                                 
                    Worm Breeder's Gazette                  6          6     <0.01       9                                 
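
The "Average per entry" values are consistent with dividing the total number of lines by the number of entries in the whole release (526969, from the introduction), not by the number of entries that carry such a line. Values that round to 0.00 seem to be printed as "<0.01". A minimal Python sketch under those two assumptions; the name `average_per_entry` is illustrative and not part of any UniProt tool:

```python
# Entry count for release 2011_04, taken from the introduction of this document.
TOTAL_ENTRIES = 526969

def average_per_entry(total_lines: int, total_entries: int = TOTAL_ENTRIES) -> str:
    """Format lines-per-entry the way the tables appear to (assumed rule)."""
    text = f"{total_lines / total_entries:.2f}"
    # Assumption: averages that round to 0.00 are shown as "<0.01".
    return "<0.01" if text == "0.00" else text

print(average_per_entry(960597))  # References (RL)
print(average_per_entry(766116))  # Journal
print(average_per_entry(646))     # Book citation
```

Comparing the result for a row with the value printed in the table is a quick way to check that a row was transcribed correctly.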
                    
                    Total number of distinct authors cited in UniProtKB/Swiss-Prot: 300939
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank
                    ------------------------------------  -------- ---------  ---------  ----
                    Comments (CC)                        2296215                 4.36                                         
                    ALLERGEN                              498        498     <0.01      26                                 
                    ALTERNATIVE PRODUCTS                19790      19790      0.04      13                                 
                    BIOPHYSICOCHEMICAL PROPERTIES        3536       3536      0.01      23                                 
                    BIOTECHNOLOGY                         298        296     <0.01      28                                 
                    CATALYTIC ACTIVITY                 232366     211864      0.44       4                                 
                    CAUTION                              7510       7358      0.01      19                                 
                    COFACTOR                           102612      94331      0.19       7                                 
                    DEVELOPMENTAL STAGE                  9215       9215      0.02      17                                 
                    DISEASE                              4495       3043      0.01      21                                 
                    DISRUPTION PHENOTYPE                 3540       3540      0.01      22                                 
                    DOMAIN                              34305      30350      0.07      11                                 
                    ENZYME REGULATION                    9537       9537      0.02      16                                 
                    FUNCTION                           397187     380670      0.75       2                                 
                    INDUCTION                           12943      12943      0.02      14                                 
                    INTERACTION                         12814      12814      0.02      15                                 
                    MASS SPECTROMETRY                    4798       3646      0.01      20                                 
                    MISCELLANEOUS                       30841      28468      0.06      12                                 
                    PATHWAY                            129151     117832      0.25       6                                 
                    PHARMACEUTICAL                         84         84     <0.01      29                                 
                    POLYMORPHISM                          824        784     <0.01      24                                 
                    PTM                                 38106      30619      0.07       9                                 
                    RNA EDITING                           619        619     <0.01      25                                 
                    SEQUENCE CAUTION                    39131      39131      0.07       8                                 
                    SIMILARITY                         620061     502192      1.18       1                                 
                    SUBCELLULAR LOCATION               309577     304214      0.59       3                                 
                    SUBUNIT                            227967     227967      0.43       5                                 
                    TISSUE SPECIFICITY                  35321      35321      0.07      10                                 
                    TOXIC DOSE                            461        448     <0.01      27                                 
                    WEB RESOURCE                         8628       6910      0.02      18                                 
                    
                    Total number of comment topics: 29
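
Subtype names in these tables may contain spaces (for example SUBCELLULAR LOCATION), so splitting a row on whitespace from the left does not give fixed fields. One possible way to read a subtype row, assuming it always ends in four whitespace-separated fields (total, entries, average, rank), is to split from the right. `Row` and `parse_row` are hypothetical names; the sample rows are copied from the comments table above:

```python
from typing import NamedTuple

class Row(NamedTuple):
    name: str
    total: int
    entries: int
    average: str  # kept as text because values such as "<0.01" are not numeric
    rank: int

def parse_row(line: str) -> Row:
    """Split a subtype row from the right so the name may contain spaces."""
    name, total, entries, average, rank = line.strip().rsplit(None, 4)
    return Row(name, int(total), int(entries), average, int(rank))

sample = [
    "SUBCELLULAR LOCATION               309577     304214      0.59       3",
    "CATALYTIC ACTIVITY                 232366     211864      0.44       4",
    "RNA EDITING                           619        619     <0.01      25",
]
rows = [parse_row(line) for line in sample]
for row in rows:
    # A positive difference means some entries carry more than one such line.
    print(row.name, row.total - row.entries)
```

This sketch does not handle the summary rows such as "Comments (CC)", which have only a total and an average.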
                    
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank
                    ------------------------------------  -------- ---------  ---------  ----
                    Features (FT)                        3375524                 6.41                                         
                    ACT_SITE                           132317      79685      0.25       9                                 
                    BINDING                            220775      61887      0.42       4                                 
                    CA_BIND                              3770       1557      0.01      35                                 
                    CARBOHYD                           102657      26032      0.19      13                                 
                    CHAIN                              533474     521388      1.01       1                                 
                    COILED                              18959      12954      0.04      26                                 
                    COMPBIAS                            51268      26873      0.10      18                                 
                    CONFLICT                           119340      41834      0.23      11                                 
                    CROSSLNK                             6001       3594      0.01      34                                 
                    DISULFID                           100424      27063      0.19      15                                 
                    DNA_BIND                            11115      10239      0.02      30                                 
                    DOMAIN                             153461      91660      0.29       6                                 
                    HELIX                              142709      14913      0.27       7                                 
                    INIT_MET                            14962      14962      0.03      27                                 
                    INTRAMEM                             1869        806     <0.01      38                                 
                    LIPID                               10857       6892      0.02      31                                 
                    METAL                              287358      70704      0.55       3                                 
                    MOD_RES                            183729      60716      0.35       5                                 
                    MOTIF                               33119      21340      0.06      24                                 
                    MUTAGEN                             34522       8157      0.07      22                                 
                    NON_CONS                             1937        725     <0.01      37                                 
                    NON_STD                               351        276     <0.01      39                                 
                    NON_TER                             11972       9104      0.02      29                                 
                    NP_BIND                            109469      69615      0.21      12                                 
                    PEPTIDE                              9512       6350      0.02      32                                 
                    PROPEP                              12029      10316      0.02      28                                 
                    REGION                             100955      54287      0.19      14                                 
                    REPEAT                              90531      13387      0.17      16                                 
                    SIGNAL                              36371      36361      0.07      21                                 
                    SITE                                39018      23095      0.07      20                                 
                    STRAND                             140986      13881      0.27       8                                 
                    TOPO_DOM                           121743      24821      0.23      10                                 
                    TRANSIT                              7460       7373      0.01      33                                 
                    TRANSMEM                           344289      70518      0.65       2                                 
                    TURN                                33333      11680      0.06      23                                 
                    UNSURE                               2500        491     <0.01      36                                 
                    VAR_SEQ                             40149      17271      0.08      19                                 
                    VARIANT                             81390      16614      0.15      17                                 
                    ZN_FING                             28843      12509      0.05      25                                 
                    
                    Total number of feature keys: 39
                    
                    
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank      Category
                    ------------------------------------  -------- ---------  ---------  ----      -------------------------------------------
                    Cross-references (DR)               14270110                27.08                                                           
                    2DBase-Ecoli                           85         85     <0.01     122      2D gel databases                             
                    Aarhus/Ghent-2DPAGE                   126         96     <0.01     119      2D gel databases                             
                    AGD                                   881        875     <0.01      98      Organism-specific databases                  
                    Allergome                            1331        806     <0.01      93      Protein family/group databases               
                    ANU-2DPAGE                             23         23     <0.01     128      2D gel databases                             
                    ArachnoServer                         759        755     <0.01     100      Organism-specific databases                  
                    ArrayExpress                        58563      58563      0.11      43      Gene expression databases                    
                    Bgee                                40034      40026      0.08      47      Gene expression databases                    
                    BindingDB                             297        297     <0.01     115      Other                                        
                    BioCyc                             252288     243672      0.48      19      Enzyme and pathway databases                 
                    BRENDA                              65259      62459      0.12      41      Enzyme and pathway databases                 
                    CAZy                                 7257       6512      0.01      71      Protein family/group databases               
                    CGD                                   596        586     <0.01     104      Organism-specific databases                  
                    CleanEx                             30159      29509      0.06      49      Gene expression databases                    
                    COMPLUYEAST-2DPAGE                    101        100     <0.01     121      2D gel databases                             
                    ConoServer                            613        587     <0.01     103      Organism-specific databases                  
                    Cornea-2DPAGE                          67         67     <0.01     123      2D gel databases                             
                    CTD                                 65804      65251      0.12      39      Organism-specific databases                  
                    CYGD                                 6638       6555      0.01      73      Organism-specific databases                  
                    dictyBase                            4030       4030      0.01      86      Organism-specific databases                  
                    DIP                                 12505      12383      0.02      64      Protein-protein interaction databases        
                    DisProt                               397        394     <0.01     110      3D structure databases                       
                    DOSAC-COBS-2DPAGE                     149        147     <0.01     118      2D gel databases                             
                    DrugBank                             5318       1627      0.01      76      Other                                        
                    EchoBASE                             4167       4163      0.01      85      Organism-specific databases                  
                    ECO2DBASE                             352        300     <0.01     112      2D gel databases                             
                    EcoGene                              4291       4289      0.01      84      Organism-specific databases                  
                    eggNOG                             219029     219029      0.42      20      Phylogenomic databases                       
                    EMBL                               882588     516735      1.67       3      Sequence databases                           
                    Ensembl                             75181      58342      0.14      34      Genome annotation databases                  
                    EnsemblBacteria                     98007      84789      0.19      29      Genome annotation databases                  
                    EnsemblFungi                        14947      14799      0.03      62      Genome annotation databases                  
                    EnsemblMetazoa                      11318       8699      0.02      65      Genome annotation databases                  
                    EnsemblPlants                       15654      13598      0.03      60      Genome annotation databases                  
                    EnsemblProtists                      4433       4316      0.01      83      Genome annotation databases                  
                    euHCVdb                                55         44     <0.01     124      Organism-specific databases                  
                    EuPathDB                              305        305     <0.01     114      Organism-specific databases                  
                    FlyBase                              5776       5402      0.01      75      Organism-specific databases                  
                    Gene3D                             312979     242740      0.59      16      Family and domain databases                  
                    GeneCards                           20247      19685      0.04      54      Organism-specific databases                  
                    GeneDB_Spombe                        4978       4934      0.01      78      Organism-specific databases                  
                    GeneFarm                             2926       2912      0.01      89      Organism-specific databases                  
                    GeneID                             469403     449725      0.89       6      Genome annotation databases                  
                    GeneTree                           167874     167832      0.32      22      Phylogenomic databases                       
                    Genevestigator                      65678      65678      0.12      40      Gene expression databases                    
                    GenoList                             7052       7040      0.01      72      Organism-specific databases                  
                    GenomeReviews                      372606     352813      0.71      10      Genome annotation databases                  
                    GermOnline                          41918      41300      0.08      46      Gene expression databases                    
                    GlycoSuiteDB                          272        272     <0.01     116      PTM databases                                
                    GO                                2138909     493951      4.06       1      Ontologies                                   
                    Gramene                              4559       4559      0.01      82      Organism-specific databases                  
                    H-InvDB                             13205      12308      0.03      63      Organism-specific databases                  
                    HAMAP                              309129     308983      0.59      17      Family and domain databases                  
                    HGNC                                19711      19541      0.04      56      Organism-specific databases                  
                    HOGENOM                            363107     363107      0.69      12      Phylogenomic databases                       
                    HOVERGEN                            74768      74768      0.14      35      Phylogenomic databases                       
                    HPA                                 11292       8331      0.02      66      Organism-specific databases                  
                    HSSP                                29689      29689      0.06      50      3D structure databases                       
                    InParanoid                          67417      67417      0.13      38      Phylogenomic databases                       
                    IntAct                              24543      24543      0.05      52      Protein-protein interaction databases        
                    InterPro                          1731309     502129      3.29       2      Family and domain databases                  
                    IPI                                 91037      65120      0.17      31      Sequence databases                           
                    KEGG                               444822     423492      0.84       8      Genome annotation databases                  
                    LegioList                             761        759     <0.01      99      Organism-specific databases                  
                    Leproma                               671        668     <0.01     102      Organism-specific databases                  
                    MaizeGDB                              477        473     <0.01     107      Organism-specific databases                  
                    MEROPS                              10499      10168      0.02      67      Protein family/group databases               
                    MGI                                 16256      16211      0.03      59      Organism-specific databases                  
                    MIM                                 16595      12944      0.03      58      Organism-specific databases                  
                    MINT                                17530      17530      0.03      57      Protein-protein interaction databases        
                    NextBio                             48940      48938      0.09      44      Other                                        
                    neXtProt                            20063      20062      0.04      55      Organism-specific databases                  
                    NMPDR                              131878     131873      0.25      26      Genome annotation databases                  
                    OGP                                   377        377     <0.01     111      2D gel databases                             
                    OMA                                371218     371218      0.70      11      Phylogenomic databases                       
                    Orphanet                             3759       2285      0.01      87      Organism-specific databases                  
                    OrthoDB                             76630      76574      0.15      33      Phylogenomic databases                       
                    PANTHER                            157606     150301      0.30      23      Family and domain databases                  
                    Pathway_Interaction_DB               4567       1665      0.01      81      Enzyme and pathway databases                 
                    PDB                                 74564      16739      0.14      37      3D structure databases                       
                    PDBsum                              74564      16739      0.14      36      3D structure databases                       
                    PeptideAtlas                         5166       5166      0.01      77      Proteomic databases                          
                    PeroxiBase                            739        728     <0.01     101      Protein family/group databases               
                    Pfam                               698834     489214      1.33       4      Family and domain databases                  
                    PharmGKB                            15420      15113      0.03      61      Organism-specific databases                  
                    PHCI-2DPAGE                           247        247     <0.01     117      2D gel databases                             
                    PhosphoSite                         23762      23762      0.05      53      PTM databases                                
                    PhosSite                              351        351     <0.01     113      PTM databases                                
                    PhylomeDB                          122916     122916      0.23      27      Phylogenomic databases                       
                    PIR                                116515     106496      0.22      28      Sequence databases                           
                    PIRSF                               86232      86232      0.16      32      Family and domain databases                  
                    PMAP-CutDB                           1394       1394     <0.01      92      Other                                        
                    PMMA-2DPAGE                            52         52     <0.01     125      2D gel databases                             
                    PptaseDB                               34         34     <0.01     126      Protein family/group databases               
                    PRIDE                               60747      60747      0.12      42      Proteomic databases                          
                    PRINTS                             137159     118794      0.26      25      Family and domain databases                  
                    ProDom                              27799      27620      0.05      51      Family and domain databases                  
                    ProMEX                                474        474     <0.01     108      Proteomic databases                          
                    PROSITE                            468286     297320      0.89       7      Family and domain databases                  
                    ProtClustDB                        339838     339838      0.64      14      Phylogenomic databases                       
                    ProteinModelPortal                 416608     416608      0.79       9      3D structure databases                       
                    PseudoCAP                            1223       1214     <0.01      95      Organism-specific databases                  
                    Rat-heart-2DPAGE                       28         28     <0.01     127      2D gel databases                             
                    Reactome                             8997       5318      0.02      69      Enzyme and pathway databases                 
                    REBASE                                441        398     <0.01     109      Protein family/group databases               
                    RefSeq                             492731     449985      0.94       5      Sequence databases                           
                    REPRODUCTION-2DPAGE                  1255       1034     <0.01      94      2D gel databases                             
                    RGD                                  7504       7500      0.01      70      Organism-specific databases                  
                    SGD                                  6638       6572      0.01      74      Organism-specific databases                  
                    Siena-2DPAGE                          102        102     <0.01     120      2D gel databases                             
                    SMART                              155823     118581      0.30      24      Family and domain databases                  
                    SMR                                349273     349273      0.66      13      3D structure databases                       
                    STRING                             206260     206226      0.39      21      Protein-protein interaction databases        
                    SUPFAM                             319573     253362      0.61      15      Family and domain databases                  
                    SWISS-2DPAGE                         1184       1183     <0.01      96      2D gel databases                             
                    TAIR                                10285      10199      0.02      68      Organism-specific databases                  
                    TCDB                                 3531       3519      0.01      88      Protein family/group databases               
                    TIGR                                34247      33476      0.06      48      Genome annotation databases                  
                    TIGRFAMs                           285402     265241      0.54      18      Family and domain databases                  
                    TubercuList                          1802       1766     <0.01      91      Organism-specific databases                  
                    UCD-2DPAGE                            511        502     <0.01     106      2D gel databases                             
                    UCSC                                48681      39681      0.09      45      Genome annotation databases                  
                    UniGene                             92935      85406      0.18      30      Sequence databases                           
                    VectorBase                            532        519     <0.01     105      Genome annotation databases                  
                    World-2DPAGE                          916        905     <0.01      97      2D gel databases                             
                    WormBase                             4641       3818      0.01      79      Organism-specific databases                  
                    Xenbase                              4604       4582      0.01      80      Organism-specific databases                  
                    ZFIN                                 2650       2638      0.01      90      Organism-specific databases                  
                    
                    Total number of cross-referenced databases: 128
                    
                    6.  AMINO ACID COMPOSITION
                    
                    6.1  Composition in percent for the complete database
                    
                    Ala (A) 8.27   Gln (Q) 3.93   Leu (L) 9.66   Ser (S) 6.53
                    Arg (R) 5.53   Glu (E) 6.75   Lys (K) 5.85   Thr (T) 5.33
                    Asn (N) 4.06   Gly (G) 7.09   Met (M) 2.42   Trp (W) 1.08
                    Asp (D) 5.45   His (H) 2.27   Phe (F) 3.86   Tyr (Y) 2.92
                    Cys (C) 1.36   Ile (I) 5.97   Pro (P) 4.69   Val (V) 6.87
                    
                    Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.00
                    
                    
                    
                    Legend: gray = aliphatic, red = acidic, green = small hydroxy,
                    blue = basic, black = aromatic, white = amide, yellow = sulfur
                    
                    
                    6.2  Classification of the amino acids by their frequency
                    
                    Leu, Ala, Gly, Val, Glu, Ser, Ile, Lys, Arg, Asp, Thr, Pro, Asn, Gln,
                    Phe, Tyr, Met, His, Cys, Trp
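
The ordering in 6.2 follows directly from the percentages in 6.1. As a minimal sketch (the dictionary `composition` and the helper `rank_by_frequency` are illustrative names, not part of any UniProt tool), the ranking can be reproduced by sorting the 20 standard residues by their percentage, highest first:

```python
# Percent composition copied from section 6.1 of this release (2011_04).
# The ambiguous codes B, Z and X are left out because they round to zero.
composition = {
    "Ala": 8.27, "Gln": 3.93, "Leu": 9.66, "Ser": 6.53,
    "Arg": 5.53, "Glu": 6.75, "Lys": 5.85, "Thr": 5.33,
    "Asn": 4.06, "Gly": 7.09, "Met": 2.42, "Trp": 1.08,
    "Asp": 5.45, "His": 2.27, "Phe": 3.86, "Tyr": 2.92,
    "Cys": 1.36, "Ile": 5.97, "Pro": 4.69, "Val": 6.87,
}


def rank_by_frequency(comp):
    """Return residue names ordered from most to least frequent."""
    ordered = sorted(comp.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ordered]


ranking = rank_by_frequency(composition)
total_percent = sum(composition.values())

print(", ".join(ranking))
print(f"Sum of listed percentages: {total_percent:.2f}")
```

The listed percentages do not add up to exactly 100 because each value in 6.1 is rounded to two decimals.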
                    
                    
                    7.  MISCELLANEOUS STATISTICS
                    
                    4445 entries are encoded on a mitochondrion, and 3608 are encoded on a plasmid.
                    
                    12183 entries are encoded on a plastid, 
                    of which 21 are encoded on apicoplasts, 
                    11619 on chloroplasts, 
                    50 on organellar chromatophores,
                    145 on cyanelles, 
                    149 on non-photosynthetic plastids and 
                    199 on unspecified types of plastid.
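
The plastid breakdown above can be checked against its stated total. The following is a small sketch using only the figures quoted in this section; the names `plastid_total`, `plastid_breakdown` and `plastid_share` are illustrative and assumed for this example:

```python
# Figures copied from the plastid paragraph of section 7 (release 2011_04).
plastid_total = 12183
plastid_breakdown = {
    "apicoplasts": 21,
    "chloroplasts": 11619,
    "organellar chromatophores": 50,
    "cyanelles": 145,
    "non-photosynthetic plastids": 149,
    "unspecified plastids": 199,
}

# The sub-categories should account for every plastid-encoded entry.
breakdown_sum = sum(plastid_breakdown.values())
is_consistent = breakdown_sum == plastid_total

# Share of each plastid type, as a percentage of all plastid-encoded entries.
plastid_share = {
    kind: 100.0 * count / plastid_total
    for kind, count in plastid_breakdown.items()
}

print(f"Breakdown sums to {breakdown_sum}; consistent with total: {is_consistent}")
for kind, share in plastid_share.items():
    print(f"{kind:30s} {share:6.2f}%")
```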
                    
                    Number of entries with at least one sequence correction: 70679