UniProtKB/Swiss-Prot protein knowledgebase release 2011_03 statistics
                    
                    
                    1.  INTRODUCTION
                    
                    Release 2011_03 of 08-Mar-11 of UniProtKB/Swiss-Prot contains 525997 sequence entries,
                    comprising 185874894 amino acids abstracted from 196176 references. 
                    
                    890 sequences have been added since release 2011_02, the sequence data of
                    83 existing entries have been updated, and the annotations of
                    163980 entries have been revised.
                    
                    Number of fragments: 8826
                    Number of additional sequences produced by alternative splicing, initiation or promoter usage, or ribosomal frameshifting: 29994
                    
                    
                    Protein existence (PE):           entries     %
                    
                    1: Evidence at protein level        72361   13.8%
                    2: Evidence at transcript level     68581   13.0%
                    3: Inferred from homology          368871   70.1%
                    4: Predicted                        14336    2.7%
                    5: Uncertain                         1848    0.4%
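The percentage column can be reproduced from the entry counts. The following is a minimal Python sketch, not part of the release itself: the names `pe_counts` and `pe_percentages` are illustrative, and the counts are copied from the table above.

```python
# Entry counts per protein existence (PE) level, copied from the table above.
pe_counts = {
    "1: Evidence at protein level": 72361,
    "2: Evidence at transcript level": 68581,
    "3: Inferred from homology": 368871,
    "4: Predicted": 14336,
    "5: Uncertain": 1848,
}


def pe_percentages(counts):
    """Return each level's share of all entries, rounded to one decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


if __name__ == "__main__":
    for label, pct in pe_percentages(pe_counts).items():
        print(f"{label:<34}{pe_counts[label]:>7}{pct:>7.1f}%")
```

The five counts add up to the 525997 entries reported for this release, so the percentages are shares of the whole database.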
                    
                    The growth of the database is summarized below.
                    
                    
                    
                    
                    2.  TAXONOMIC ORIGIN
                    
                    Total number of species represented in this release of UniProtKB/Swiss-Prot: 12414
                    
                    The first twenty species represent 109676 sequences:  20.9 % of the total
                    number of entries.
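The figure for the first twenty species can be checked against the entry counts listed in table 2.2. This is a small Python sketch under that assumption; `top_twenty`, `TOTAL_ENTRIES` and `share_of_total` are hypothetical names introduced for illustration.

```python
# Entry counts of the twenty most represented species (ranks 1-20 in table 2.2).
top_twenty = [20233, 16344, 10182, 7594, 6596, 5826, 4976, 4430, 4244, 4181,
              3330, 3294, 3108, 2716, 2656, 2213, 2196, 1997, 1782, 1778]
TOTAL_ENTRIES = 525997  # total number of sequence entries in this release


def share_of_total(counts, total):
    """Return (sum of counts, percentage of total rounded to one decimal)."""
    subtotal = sum(counts)
    return subtotal, round(100 * subtotal / total, 1)


if __name__ == "__main__":
    subtotal, pct = share_of_total(top_twenty, TOTAL_ENTRIES)
    print(f"{subtotal} sequences: {pct}% of the total")
```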
                    
                    
                    2.1 Table of the frequency of occurrence of species
                    
                    Species represented 1x: 5303
                                        2x: 1774
                                        3x:  938
                                        4x:  605
                                        5x:  444
                                        6x:  354
                                        7x:  259
                                        8x:  214
                                        9x:  193
                                       10x:  105
                                   11- 20x:  621
                                   21- 50x:  393
                                   51-100x:  187
                                     >100x: 1024
                    
                    
                    2.2  Table of the most represented species
                    
                    ------  ---------  --------------------------------------------
                    Number  Frequency  Species
                    ------  ---------  --------------------------------------------
                    1      20233  Homo sapiens (Human)
                    2      16344  Mus musculus (Mouse)
                    3      10182  Arabidopsis thaliana (Mouse-ear cress)
                    4       7594  Rattus norvegicus (Rat)
                    5       6596  Saccharomyces cerevisiae (Baker's yeast)
                    6       5826  Bos taurus (Bovine)
                    7       4976  Schizosaccharomyces pombe (Fission yeast)
                    8       4430  Escherichia coli (strain K12)
                    9       4244  Bacillus subtilis
                    10       4181  Dictyostelium discoideum (Slime mold)
                    11       3330  Caenorhabditis elegans
                    12       3294  Xenopus laevis (African clawed frog)
                    13       3108  Drosophila melanogaster (Fruit fly)
                    14       2716  Danio rerio (Zebrafish) (Brachydanio rerio)
                    15       2656  Oryza sativa subsp. japonica (Rice)
                    16       2213  Pongo abelii (Sumatran orangutan)
                    17       2196  Gallus gallus (Chicken)
                    18       1997  Escherichia coli O157:H7
                    19       1782  Methanocaldococcus jannaschii (Methanococcus jannaschii)
                    20       1778  Salmonella typhimurium
                    21       1771  Haemophilus influenzae
                    22       1746  Mycobacterium tuberculosis
                    23       1674  Shigella flexneri
                    24       1672  Escherichia coli O6
                    25       1589  Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
                    26       1382  Sus scrofa (Pig)
                    27       1342  Salmonella typhi
                    28       1284  Pseudomonas aeruginosa
                    29       1224  Mycobacterium bovis
                    30       1165  Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey)
                    31       1024  Synechocystis sp. (strain ATCC 27184 / PCC 6803 / N-1)
                    32       1000  Yersinia pestis
                    33        993  Archaeoglobus fulgidus
                    34        946  Vibrio cholerae
                    35        929  Salmonella paratyphi A
                    36        924  Staphylococcus aureus (strain N315)
                    37        923  Staphylococcus aureus (strain Mu50 / ATCC 700699)
                    38        913  Rhizobium meliloti (Sinorhizobium meliloti)
                    39        909  Acanthamoeba polyphaga mimivirus (APMV)
                    40        897  Staphylococcus aureus (strain COL)
                    41        895  Staphylococcus aureus (strain MW2)
                    42        889  Staphylococcus aureus (strain MSSA476)
                    43        886  Staphylococcus aureus (strain MRSA252)
                    44        885  Escherichia coli O6:K15:H31 (strain 536 / UPEC)
                    45        884  Oryctolagus cuniculus (Rabbit)
                    46        881  Salmonella choleraesuis
                    47        876  Shigella sonnei (strain Ss046)
                    48        870  Ashbya gossypii (strain ATCC 10895 / CBS 109.51 / FGSC 9923 / NRRL Y-1056)  
                    49        864  Yersinia pseudotuberculosis
                    50        845  Kluyveromyces lactis (Yeast) (Candida sphaerica)
                    51        841  Escherichia coli O9:H4 (strain HS)
                    52        834  Escherichia coli O139:H28 (strain E24377A / ETEC)
                    53        828  Candida albicans (Yeast)
                    54        826  Shigella boydii serotype 4 (strain Sb227)
                    55        823  Escherichia coli (strain UTI89 / UPEC)
                    56        819  Escherichia coli (strain ATCC 8739 / DSM 1576 / Crooks)
                    57        808  Shigella dysenteriae serotype 1 (strain Sd197)
                    58        802  Candida glabrata (Yeast) (Torulopsis glabrata)
                    59        794  Vibrio parahaemolyticus
                    60        791  Neurospora crassa
                    61        790  Escherichia coli (strain SMS-3-5 / SECEC)
                    62        779  Erwinia carotovora subsp. atroseptica (Pectobacterium atrosepticum)
                    63        778  Pasteurella multocida
                    64        775  Canis familiaris (Dog) (Canis lupus familiaris)
                    65        773  Aquifex aeolicus
                    66        770  Escherichia coli (strain K12 / DH10B)
                    67        764  Escherichia coli O127:H6 (strain E2348/69 / EPEC)
                    68        764  Escherichia coli (strain K12 / MC4100 / BW2952)
                    69        762  Escherichia coli O17:K52:H18 (strain UMN026 / ExPEC)
                    70        762  Escherichia coli (strain 55989 / EAEC)
                    71        761  Escherichia coli O8 (strain IAI1)
                    72        757  Shigella flexneri serotype 5b (strain 8401)
                    73        757  Staphylococcus epidermidis (strain ATCC 35984 / RP62A)
                    74        756  Escherichia coli (strain SE11)
                    75        756  Staphylococcus epidermidis (strain ATCC 12228)
                    76        756  Escherichia coli O45:K1 (strain S88 / ExPEC)
                    77        753  Escherichia coli O7:K1 (strain IAI39 / ExPEC)
                    78        751  Streptomyces coelicolor
                    79        746  Escherichia coli O157:H7 (strain EC4115 / EHEC)
                    80        741  Photorhabdus luminescens subsp. laumondii
                    81        732  Yersinia enterocolitica serotype O:8 / biotype 1B (strain 8081)
                    82        731  Escherichia coli O81 (strain ED1a)
                    83        731  Bacillus halodurans
                    84        731  Vibrio vulnificus
                    85        730  Emericella nidulans (Aspergillus nidulans)
                    86        724  Bacillus anthracis
                    87        720  Salmonella enteritidis PT4 (strain P125109)
                    88        716  Vibrio vulnificus (strain YJ016)
                    89        715  Salmonella paratyphi B (strain ATCC BAA-1250 / SPB7)
                    90        713  Yersinia pestis bv. Antiqua (strain Nepal516)
                    91        713  Staphylococcus aureus (strain NCTC 8325)
                    92        713  Salmonella paratyphi A (strain AKU_12601)
                    93        713  Yersinia pseudotuberculosis serotype O:1b (strain IP 31758)
                    94        711  Salmonella agona (strain SL483)
                    95        711  Escherichia coli O1:K1 / APEC
                    96        711  Salmonella newport (strain SL254)
                    97        710  Salmonella heidelberg (strain SL476)
                    98        709  Yersinia pestis bv. Antiqua (strain Antiqua)
                    99        709  Salmonella schwarzengrund (strain CVM19633)
                    100        706  Enterobacter sp. (strain 638)
                    101        705  Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)
                    102        700  Salmonella dublin (strain CT_02021853)
                    103        697  Shigella boydii serotype 18 (strain CDC 3083-94 / BS512)
                    104        691  Klebsiella pneumoniae (strain 342)
                    105        687  Mycoplasma pneumoniae
                    106        686  Pan troglodytes (Chimpanzee)
                    107        686  Escherichia fergusonii (strain ATCC 35469 / DSM 13698 / CDC 0568-73)
                    108        685  Nostoc sp. (strain PCC 7120 / UTEX 2576)
                    109        684  Pseudomonas syringae pv. tomato
                    110        682  Salmonella gallinarum (strain 287/91 / NCTC 13346)
                    111        672  Pseudomonas putida (strain KT2440)
                    112        670  Citrobacter koseri (strain ATCC BAA-895 / CDC 4225-83 / SGSC4696)
                    113        668  Mycobacterium leprae
                    114        667  Zea mays (Maize)
                    115        666  Staphylococcus aureus (strain USA300)
                    116        666  Yersinia pestis (strain Pestoides F)
                    117        662  Serratia proteamaculans (strain 568)
                    118        658  Rhizobium sp. (strain NGR234)
                    119        650  Bradyrhizobium japonicum
                    120        642  Staphylococcus aureus (strain bovine RF122 / ET3-1)
                    121        640  Escherichia coli
                    122        638  Bacillus cereus (strain ATCC 14579 / DSM 31)
                    123        637  Yersinia pseudotuberculosis serotype O:3 (strain YPIII)
                    124        635  Salmonella arizonae (strain ATCC BAA-731 / CDC346-86 / RSK2980)
                    125        633  Yersinia pseudotuberculosis serotype IB (strain PB1/+)
                    126        626  Debaryomyces hansenii (Yeast) (Torulaspora hansenii)
                    127        626  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
                    128        621  Shewanella oneidensis
                    129        615  Treponema pallidum
                    130        614  Ralstonia solanacearum (Pseudomonas solanacearum)
                    131        613  Yarrowia lipolytica (Candida lipolytica)
                    132        613  Enterobacter sakazakii (strain ATCC BAA-894)
                    133        610  Staphylococcus haemolyticus (strain JCSC1435)
                    134        605  Rhizobium loti (Mesorhizobium loti)
                    135        603  Methanobacterium thermoautotrophicum
                    136        602  Staphylococcus saprophyticus subsp. saprophyticus 
                    137        600  Yersinia pestis bv. Antiqua (strain Angola)
                    138        600  Salmonella paratyphi C (strain RKS4594)
                    139        598  Photobacterium profundum (Photobacterium sp. (strain SS9))
                    140        597  Listeria monocytogenes
                    141        591  Aspergillus fumigatus (Sartorya fumigata)
                    142        590  Bacillus cereus (strain ATCC 10987)
                    143        590  Xanthomonas campestris pv. campestris
                    144        588  Listeria innocua
                    145        585  Rickettsia prowazekii
                    146        585  Pectobacterium carotovorum subsp. carotovorum (strain PC1)
                    147        584  Helicobacter pylori (Campylobacter pylori)
                    148        581  Lactococcus lactis subsp. lactis (Streptococcus lactis)
                    149        579  Neisseria meningitidis serogroup B
                    150        576  Brucella suis
                    151        572  Brucella melitensis
                    152        572  Buchnera aphidicola subsp. Acyrthosiphon pisum 
                    153        569  Bacillus thuringiensis subsp. konkukian
                    154        565  Helicobacter pylori J99 (Campylobacter pylori J99)
                    155        565  Pseudomonas syringae pv. syringae (strain B728a)
                    156        562  Buchnera aphidicola subsp. Schizaphis graminum
                    157        560  Bacillus licheniformis (strain DSM 13 / ATCC 14580)
                    158        560  Bacillus cereus (strain ZK / E33L)
                    159        557  Pseudomonas aeruginosa (strain UCBPP-PA14)
                    160        556  Neisseria meningitidis serogroup A
                    161        556  Clostridium acetobutylicum
                    162        556  Xanthomonas axonopodis pv. citri (Citrus canker)
                    163        554  Vibrio fischeri (strain ATCC 700601 / ES114)
                    164        553  Oryza sativa subsp. indica (Rice)
                    165        552  Pseudomonas fluorescens (strain Pf0-1)
                    166        551  Caenorhabditis briggsae
                    167        549  Oceanobacillus iheyensis
                    168        547  Caulobacter crescentus (Caulobacter vibrioides)
                    169        545  Pseudomonas fluorescens (strain Pf-5 / ATCC BAA-477)
                    170        543  Pseudomonas syringae pv. phaseolicola (strain 1448A / Race 6)
                    171        529  Listeria monocytogenes serotype 4b (strain F2365)
                    172        525  Sodalis glossinidius (strain morsitans)
                    173        525  Erwinia tasmaniensis (strain DSM 17950 / Et1/99)
                    174        522  Xylella fastidiosa
                    175        521  Bordetella bronchiseptica (Alcaligenes bronchisepticus)
                    176        519  Streptococcus pneumoniae
                    177        513  Chromobacterium violaceum
                    178        512  Thermotoga maritima
                    179        512  Xylella fastidiosa (strain Temecula1 / ATCC 700964)
                    180        510  Vibrio cholerae serotype O1 (strain ATCC 39541 / Ogawa 395 / O395)
                    181        507  Bordetella parapertussis
                    182        507  Buchnera aphidicola subsp. Baizongia pistaciae (strain Bp)
                    183        507  Pseudomonas aeruginosa (strain PA7)
                    184        507  Haemophilus ducreyi
                    185        506  Bordetella pertussis
                    186        505  Staphylococcus aureus (strain Newman)
                    187        504  Geobacillus kaustophilus
                    188        500  Pseudomonas entomophila (strain L48)
                    189        499  Deinococcus radiodurans
                    190        498  Brucella abortus
                    191        497  Rickettsia conorii
                    192        496  Bacillus clausii (strain KSM-K16)
                    193        493  Corynebacterium glutamicum (Brevibacterium flavum)
                    194        493  Streptomyces avermitilis
                    195        492  Haemophilus influenzae (strain 86-028NP)
                    196        491  Xanthomonas campestris pv. campestris (strain 8004)
                    197        490  Vibrio harveyi (strain ATCC BAA-1116 / BB120)
                    198        490  Clostridium perfringens
                    199        488  Bacillus amyloliquefaciens (strain FZB42)
                    200        487  Burkholderia pseudomallei (Pseudomonas pseudomallei)
                    201        487  Shewanella sp. (strain MR-7)
                    202        484  Mannheimia succiniciproducens (strain MBEL55E)
                    203        484  Pseudomonas aeruginosa (strain LESB58)
                    204        484  Staphylococcus aureus (strain Mu3 / ATCC 700698)
                    205        484  Shewanella sp. (strain MR-4)
                    206        483  Proteus mirabilis (strain HI4320)
                    207        483  Mycoplasma genitalium
                    208        482  Methanosarcina acetivorans
                    209        475  Acinetobacter sp. (strain ADP1)
                    210        475  Synechococcus elongatus (strain PCC 7942) (Anacystis nidulans R2)
                    211        474  Thermosynechococcus elongatus (strain BP-1)
                    212        473  Pseudomonas putida (strain F1 / ATCC 700007)
                    213        472  Burkholderia sp. (strain 383) (Burkholderia cepacia 
                    214        472  Brucella abortus (strain 2308)
                    215        469  Enterococcus faecalis (Streptococcus faecalis)
                    216        467  Pyrococcus horikoshii
                    217        466  Xanthomonas campestris pv. vesicatoria (strain 85-10)
                    218        465  Rhodopseudomonas palustris
                    219        465  Pseudomonas putida (strain GB-1)
                    220        464  Shewanella frigidimarina (strain NCIMB 400)
                    221        462  Anabaena variabilis (strain ATCC 29413 / PCC 7937)
                    222        462  Shewanella sp. (strain ANA-3)
                    223        461  Burkholderia mallei (Pseudomonas mallei)
                    224        461  Lactobacillus plantarum
                    225        461  Methanosarcina mazei (Methanosarcina frisia)
                    226        460  Cupriavidus necator (strain ATCC 17699 / H16 / DSM 428 / Stanier 337) 
                    227        459  Aeromonas hydrophila subsp. hydrophila (strain ATCC 7966 / NCIB 9240)
                    228        459  Pyrococcus abyssi
                    229        457  Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
                    230        457  Cupriavidus pinatubonensis (strain JMP134 / LMG 1197) (Alcaligenes eutrophus) 
                    231        455  Staphylococcus aureus (strain JH1)
                    232        454  Halobacterium salinarium (Halobacterium halobium)
                    233        454  Xanthomonas oryzae pv. oryzae (strain MAFF 311018)
                    234        453  Rickettsia felis (Rickettsia azadi)
                    235        452  Shewanella baltica (strain OS185)
                    236        452  Pseudomonas putida (strain W619)
                    237        451  Ovis aries (Sheep)
                    238        449  Staphylococcus aureus (strain JH9)
                    239        449  Methylococcus capsulatus
                    240        449  Streptococcus mutans
                    241        449  Aeromonas salmonicida (strain A449)
                    242        448  Thermoanaerobacter tengcongensis
                    243        448  Rhodobacter sphaeroides (strain ATCC 17023 / 2.4.1 / NCIB 8253 / DSM 158)
                    244        447  Vibrio fischeri (strain MJ11)
                    245        446  Mycobacterium paratuberculosis
                    246        444  Hahella chejuensis (strain KCTC 2396)
                    247        444  Pseudomonas mendocina (strain ymp)
                    248        443  Dechloromonas aromatica (strain RCB)
                    249        442  Pyrococcus furiosus (strain ATCC 43587 / DSM 3638 / JCM 8422 / Vc1)
                    250        441  Streptococcus pyogenes serotype M6
                    
                    
                    
                    2.3  Taxonomic distribution of the sequences
                    
                    
                    
                    Kingdom        sequences (% of the database)
                    Archaea           18401 (  3%)
                    Bacteria         325442 ( 62%)
                    Eukaryota        166671 ( 32%)
                    Viruses           15483 (  3%)
                    
                    
                    Within Eukaryota:
                    
                    
                    
                    Category            sequences (% of Eukaryota) (% of the complete database)
                    Human                  20234 ( 12%)           (  4%)
                    Other Mammalia         45048 ( 27%)           (  9%)
                    Other Vertebrata       16545 ( 10%)           (  3%)
                    Viridiplantae          30765 ( 18%)           (  6%)
                    Fungi                  27737 ( 17%)           (  5%)
                    Insecta                 8213 (  5%)           (  2%)
                    Nematoda                4183 (  3%)           (  1%)
                    Other                  13946 (  8%)           (  3%)
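Each row of the Eukaryota table carries two percentages with different denominators: the Eukaryota subtotal and the complete database. A minimal Python sketch of that calculation follows; `eukaryota`, `TOTAL_ENTRIES` and `category_shares` are illustrative names, and rounding to whole percent is assumed from the way the table is printed.

```python
TOTAL_ENTRIES = 525997  # all entries in this release

# Sequence counts per category within Eukaryota, copied from the table above.
eukaryota = {
    "Human": 20234,
    "Other Mammalia": 45048,
    "Other Vertebrata": 16545,
    "Viridiplantae": 30765,
    "Fungi": 27737,
    "Insecta": 8213,
    "Nematoda": 4183,
    "Other": 13946,
}


def category_shares(counts, database_total):
    """Map each category to (% of the group, % of the complete database)."""
    group_total = sum(counts.values())
    return {
        name: (round(100 * n / group_total), round(100 * n / database_total))
        for name, n in counts.items()
    }


if __name__ == "__main__":
    for name, (of_group, of_db) in category_shares(eukaryota, TOTAL_ENTRIES).items():
        print(f"{name:<18}{eukaryota[name]:>7} ({of_group:>3}%) ({of_db:>3}%)")
```

The category counts sum to 166671, the Eukaryota figure given in the kingdom table.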
                    
                    
                    
                    3.  SEQUENCE SIZE
                    
                    Distribution of the sequences by size (excluding fragments)
                    
                    From   To  Number             From   To   Number
                       1-  50    8649             1001-1100     3592
                      51- 100   40566             1101-1200     2487
                     101- 150   56548             1201-1300     1966
                     151- 200   56561             1301-1400     1814
                     201- 250   55381             1401-1500     1439
                     251- 300   48598             1501-1600      642
                     301- 350   49042             1601-1700      520
                     351- 400   42203             1701-1800      435
                     401- 450   34608             1801-1900      403
                     451- 500   27726             1901-2000      329
                     501- 550   19610             2001-2100      203
                     551- 600   14015             2101-2200      270
                     601- 650   11785             2201-2300      281
                     651- 700    8504             2301-2400      168
                     701- 750    7026             2401-2500      130
                     751- 800    5015             >2500         1041
                     801- 850    4369
                     851- 900    4906
                     901- 950    3719
                     951-1000    2620
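The size classes above are 50 residues wide up to 1000, 100 residues wide from 1001 to 2500, and open-ended beyond that. As a sketch of how a sequence length maps onto these classes, the hypothetical helper `size_bin` below returns the class label for a given length; it illustrates the binning scheme and is not the tool used to produce the release statistics.

```python
def size_bin(length):
    """Return the size class label for a sequence length in amino acids."""
    if length < 1:
        raise ValueError("sequence length must be at least 1")
    if length > 2500:
        return ">2500"
    # Classes are 50 wide up to 1000 residues and 100 wide from 1001 to 2500.
    width = 50 if length <= 1000 else 100
    upper = ((length + width - 1) // width) * width  # round up to class boundary
    lower = upper - width + 1
    return f"{lower}-{upper}"


if __name__ == "__main__":
    for n in (2, 353, 1000, 1001, 35213):
        print(n, size_bin(n))
```

Counting non-fragment sequences per label with `collections.Counter` would then give a table of the same shape.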
                    
                    
                    
                    
                    The average sequence length in UniProtKB/Swiss-Prot is 353 amino acids.
                    
                    The shortest sequence is   GWA_SEPOF (P83570):     2 amino acids.
                    The longest sequence is  TITIN_MOUSE (A2ASS6): 35213 amino acids.
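The average length is consistent with the totals given in the introduction. The sketch below assumes the average is the total number of amino acids divided by the total number of entries, rounded to the nearest integer; the function name `average_length` is illustrative.

```python
TOTAL_AMINO_ACIDS = 185874894  # from the introduction
TOTAL_ENTRIES = 525997         # from the introduction


def average_length(amino_acids, entries):
    """Return the mean sequence length rounded to the nearest amino acid."""
    return round(amino_acids / entries)


if __name__ == "__main__":
    print(average_length(TOTAL_AMINO_ACIDS, TOTAL_ENTRIES))
```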
                    
                    
                    4.  JOURNAL CITATIONS
                    
                    Note: the following citation statistics reflect the number of distinct
                    journal citations.
                    
                    Total number of journals cited in this release of UniProtKB/Swiss-Prot: 2119
                    
                    
                    4.1 Table of the frequency of journal citations
                    
                    Journals cited 1x:  688
                                   2x:  276
                                   3x:  152
                                   4x:  100
                                   5x:   94
                                   6x:   69
                                   7x:   36
                                   8x:   41
                                   9x:   31
                                  10x:   23
                              11- 20x:  172
                              21- 50x:  174
                              51-100x:   99
                                >100x:  164
                    
                    
                    4.2  List of the most cited journals in UniProtKB/Swiss-Prot
                    
                    Nb    Citations   Journal name
                    --    ---------   -------------------------------------------------------------
                    1        18574   Journal of Biological Chemistry
                    2         8601   Proceedings of the National Academy of Sciences of the U.S.A.
                    3         5150   Journal of Bacteriology
                    4         4643   Biochemical and Biophysical Research Communications
                    5         4530   Gene
                    6         4347   Nucleic Acids Research
                    7         4053   FEBS Letters
                    8         4020   Biochemistry
                    9         3855   The EMBO Journal
                    10         3529   Molecular and Cellular Biology
                    11         3357   Nature
                    12         3189   Journal of Molecular Biology
                    13         3136   European Journal of Biochemistry
                    14         3021   Biochimica et Biophysica Acta
                    15         2783   Cell
                    16         2484   Genomics
                    17         2228   Biochemical Journal
                    18         2197   Science
                    19         2112   Journal of Virology
                    20         1819   Molecular Microbiology
                    21         1628   Journal of Cell Biology
                    22         1529   Plant Molecular Biology
                    23         1455   Plant Physiology
                    24         1429   Genes and Development
                    25         1391   Virology
                    26         1356   Human Molecular Genetics
                    27         1354   Nature Genetics
                    28         1330   The American Journal of Human Genetics
                    29         1309   Molecular and General Genetics
                    30         1232   Oncogene
                    31         1206   Development
                    32         1182   Journal of Biochemistry
                    33         1142   Human Mutation
                    34         1075   Molecular Biology of the Cell
                    35         1037   Journal of Immunology
                    36         1019   Genetics
                    37         1014   The Plant Cell
                    38          931   Journal of General Virology
                    39          927   Structure
                    40          901   Molecular Cell
                    41          893   Infection and Immunity
                    42          857   The Plant Journal
                    43          842   Archives of Biochemistry and Biophysics
                    44          820   Blood
                    45          775   Microbiology
                    46          774   Journal of Cell Science
                    47          764   Yeast
                    48          750   Developmental Biology
                    49          689   Current Biology
                    50          689   Cancer Research
                    51          676   FEMS Microbiology Letters
                    52          603   Nature Structural Biology
                    53          599   Mechanisms of Development
                    54          597   Human Genetics
                    55          579   Acta Crystallographica, Section D
                    56          572   Protein Science
                    57          561   Applied and Environmental Microbiology
                    58          552   Journal of Neuroscience
                    59          534   Toxicon
                    60          528   Current Genetics
                    61          522   Neuron
                    62          517   Journal of Clinical Investigation
                    63          474   American Journal of Physiology
                    64          473   Mammalian Genome
                    65          455   The Journal of Experimental Medicine
                    66          451   Immunogenetics
                    67          449   Molecular Endocrinology
                    68          424   Molecular and Biochemical Parasitology
                    69          417   Journal of Neurochemistry
                    70          412   The Journal of Clinical Endocrinology and Metabolism
                    71          407   Endocrinology
                    72          394   Proteins
                    73          382   Journal of Molecular Evolution
                    74          376   Bioscience, Biotechnology, and Biochemistry
                    75          367   DNA and Cell Biology
                    76          363   Journal of Medical Genetics
                    77          362   Molecular Biology and Evolution
                    78          358   DNA Sequence
                    79          349   Plant and Cell Physiology
                    80          331   Nature Cell Biology
                    81          321   Tissue Antigens
                    82          316   Peptides
                    83          315   Experimental Cell Research
                    84          315   Brain Research. Molecular Brain Research
                    85          301   Comparative Biochemistry and Physiology
                    86          291   Biological Chemistry Hoppe-Seyler
                    87          290   Antimicrobial Agents and Chemotherapy
                    88          285   Journal of Investigative Dermatology
                    89          276   Cytogenetics and Cell Genetics
                    90          271   Molecular Pharmacology
                    91          267   Biology of Reproduction
                    92          267   Developmental Cell
                    93          256   RNA
                    94          254   Genome Research
                    95          250   Virus Research
                    96          250   Neurology
                    97          248   Journal of General Microbiology
                    98          246   Developmental Dynamics
                    99          233   Molecular Plant-Microbe Interactions
                    100          231   Planta
                    101          218   Hoppe-Seyler's Zeitschrift für Physiologische Chemie
                    102          213   Nature Structural and Molecular Biology
                    103          209   Annals of Neurology
                    104          209   Genes to Cells
                    105          208   European Journal of Immunology
                    106          207   Biochimie
                    107          207   DNA Research
                    108          204   Immunity
                    109          202   Eukaryotic cell
                    110          202   The FEBS Journal
                    111          199   The New England Journal of Medicine
                    112          197   European Journal of Human Genetics
                    113          189   Journal of Human Genetics
                    114          186   EMBO Reports
                    115          178   Investigative Ophthalmology and Visual Science
                    116          178   The FASEB Journal
                    117          177   Molecular and Cellular Endocrinology
                    118          173   PLoS ONE
                    119          169   Archives of Microbiology
                    120          169   Archives of Virology
                    121          165   American Journal of Medical Genetics
                    122          165   Molecular Phylogenetics and Evolution
                    123          164   Insect Biochemistry and Molecular Biology
                    124          162   Molecular Immunology
                    125          159   DNA
                    126          156   American Journal of Medical Genetics. Part A
                    127          155   Molecular Reproduction and Development
                    128          153   Diabetes
                    129          153   Glycobiology
                    130          153   Hemoglobin
                    131          152   Bioorganicheskaia Khimiia
                    132          150   Clinical Genetics
                    133          150   Journal of the American Chemical Society
                    134          147   Journal of Cellular Biochemistry
                    135          146   BMC Genomics
                    136          146   International Journal of Cancer
                    137          141   Molecular Genetics and Metabolism
                    138          139   Molecular and Cellular Neuroscience
                    139          138   Animal Genetics
                    140          137   General and Comparative Endocrinology
                    141          137   Nature Immunology
                    142          133   Molecular Genetics and Genomics
                    143          133   Biological Chemistry
                    144          131   British Journal of Haematology
                    145          129   Journal of Lipid Research
                    146          127   Proteomics
                    147          124   Circulation Research
                    148          123   Journal of Medicinal Chemistry
                    149          122   Agricultural and Biological Chemistry
                    150          121   Protein Expression and Purification
                    
                    
                    5.  STATISTICS FOR SOME LINE TYPES
                    
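                    The following tables summarize, for selected UniProtKB/Swiss-Prot line types,
                    the total number of lines, the number of entries with at least one such line,
                    and the average number of lines per entry, computed over all entries in the
                    release. Subtypes are ranked by total number of lines.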
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank
                    ------------------------------------  -------- ---------  ---------  ----
                    
                    References (RL)                       957990                 1.82                                         
                    Journal                            759485     398759      1.44       1                                 
                    Submitted to EMBL/GenBank/DDBJ     185633     170918      0.35       2                                 
                    Submitted to other databases        10787       9334      0.02       3                                 
                    Book citation                         646        632     <0.01       4                                 
                    Plant Gene Register                   566        554     <0.01       5                                 
                    Thesis                                402        399     <0.01       6                                 
                    Unpublished observations              293        289     <0.01       7                                 
                    Patent                                172        170     <0.01       8                                 
                    Worm Breeder's Gazette                  6          6     <0.01       9                                 
                    
                    Total number of distinct authors cited in UniProtKB/Swiss-Prot: 299909
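
The "Average per entry" column in the table above appears to be the total number of lines divided by the total number of entries in the release (525997, from the Introduction), not by the number of entries that carry the line. The following Python sketch reproduces a few printed values under that assumption. The function name and the 0.005 cut-off for printing "<0.01" are illustrative choices inferred from the printed figures, not part of the UniProtKB release notes.

```python
# Total number of entries in release 2011_03, as stated in the Introduction.
TOTAL_ENTRIES = 525_997


def average_per_entry(total_lines: int, total_entries: int = TOTAL_ENTRIES) -> str:
    """Format lines-per-entry with two decimals, or '<0.01' for tiny values.

    Assumption: values that would round to 0.00 are shown as '<0.01'.
    """
    avg = total_lines / total_entries
    return "<0.01" if avg < 0.005 else f"{avg:.2f}"


# (line type, total number of lines) copied from the References table.
rows = [
    ("References (RL)", 957_990),
    ("Journal", 759_485),
    ("Submitted to EMBL/GenBank/DDBJ", 185_633),
    ("Thesis", 402),
]

for name, total in rows:
    print(f"{name:<32}{average_per_entry(total):>6}")
```

Dividing by the number of entries that carry the line would give a different figure (for Journal, 759485 / 398759 is about 1.90 rather than the printed 1.44), which is why the release-wide entry count is used here.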
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank
                    ------------------------------------  -------- ---------  ---------  ----
                    Comments (CC)                        2287862                 4.35                                         
                    ALLERGEN                              498        498     <0.01      26                                 
                    ALTERNATIVE PRODUCTS                19742      19742      0.04      13                                 
                    BIOPHYSICOCHEMICAL PROPERTIES        3509       3509      0.01      22                                 
                    BIOTECHNOLOGY                         296        294     <0.01      28                                 
                    CATALYTIC ACTIVITY                 231219     210974      0.44       4                                 
                    CAUTION                              7485       7334      0.01      19                                 
                    COFACTOR                           102239      93971      0.19       7                                 
                    DEVELOPMENTAL STAGE                  9177       9177      0.02      17                                 
                    DISEASE                              4462       3020      0.01      21                                 
                    DISRUPTION PHENOTYPE                 3468       3468      0.01      23                                 
                    DOMAIN                              33939      29993      0.06      11                                 
                    ENZYME REGULATION                    9515       9515      0.02      16                                 
                    FUNCTION                           395943     379445      0.75       2                                 
                    INDUCTION                           12843      12843      0.02      14                                 
                    INTERACTION                         12814      12814      0.02      15                                 
                    MASS SPECTROMETRY                    4793       3642      0.01      20                                 
                    MISCELLANEOUS                       30774      28406      0.06      12                                 
                    PATHWAY                            128499     117422      0.24       6                                 
                    PHARMACEUTICAL                         84         84     <0.01      29                                 
                    POLYMORPHISM                          821        782     <0.01      24                                 
                    PTM                                 37998      30537      0.07       9                                 
                    RNA EDITING                           619        619     <0.01      25                                 
                    SEQUENCE CAUTION                    38997      38997      0.07       8                                 
                    SIMILARITY                         617830     501233      1.17       1                                 
                    SUBCELLULAR LOCATION               308713     303373      0.59       3                                 
                    SUBUNIT                            227289     227289      0.43       5                                 
                    TISSUE SPECIFICITY                  35219      35219      0.07      10                                 
                    TOXIC DOSE                            460        447     <0.01      27                                 
                    WEB RESOURCE                         8617       6900      0.02      18                                 
                    
                    Total number of comment topics: 29
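
The Rank column in the table above is consistent with ordering the comment topics by total number of lines, largest first. A minimal Python sketch of that ordering follows, using the five most frequent topics from the table. The function name is hypothetical, and the handling of ties (equal totals keep their input order) is an assumption, since the release notes do not say how ties are broken.

```python
def rank_by_total(totals: dict[str, int]) -> dict[str, int]:
    """Return a 1-based rank for each key, ordered by descending total.

    Assumption: ties keep their input order (sorted() is stable).
    """
    ordered = sorted(totals, key=totals.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


# Totals copied from the Comments (CC) table; only the top five topics.
cc_totals = {
    "CATALYTIC ACTIVITY": 231_219,
    "FUNCTION": 395_943,
    "SIMILARITY": 617_830,
    "SUBCELLULAR LOCATION": 308_713,
    "SUBUNIT": 227_289,
}

for topic, rank in sorted(rank_by_total(cc_totals).items(), key=lambda kv: kv[1]):
    print(f"{rank:>2}  {topic}")
```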
                    
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank
                    ------------------------------------  -------- ---------  ---------  ----
                    Features (FT)                        3356196                 6.38                                         
                    ACT_SITE                           131548      79334      0.25       9                                 
                    BINDING                            219501      61599      0.42       4                                 
                    CA_BIND                              3770       1557      0.01      35                                 
                    CARBOHYD                           102332      25907      0.19      13                                 
                    CHAIN                              532487     520416      1.01       1                                 
                    COILED                              18856      12869      0.04      26                                 
                    COMPBIAS                            50994      26709      0.10      18                                 
                    CONFLICT                           119136      41773      0.23      11                                 
                    CROSSLNK                             5986       3591      0.01      34                                 
                    DISULFID                           100047      26901      0.19      15                                 
                    DNA_BIND                            11026      10153      0.02      30                                 
                    DOMAIN                             152378      91151      0.29       6                                 
                    HELIX                              138527      14477      0.26       7                                 
                    INIT_MET                            14961      14961      0.03      27                                 
                    INTRAMEM                             1869        806     <0.01      38                                 
                    LIPID                               10844       6885      0.02      31                                 
                    METAL                              285967      70342      0.54       3                                 
                    MOD_RES                            183447      60669      0.35       5                                 
                    MOTIF                               32993      21226      0.06      23                                 
                    MUTAGEN                             34145       8076      0.06      22                                 
                    NON_CONS                             1915        725     <0.01      37                                 
                    NON_STD                               351        276     <0.01      39                                 
                    NON_TER                             11960       9094      0.02      28                                 
                    NP_BIND                            108544      69373      0.21      12                                 
                    PEPTIDE                              9512       6350      0.02      32                                 
                    PROPEP                              11890      10181      0.02      29                                 
                    REGION                             100185      54052      0.19      14                                 
                    REPEAT                              90383      13381      0.17      16                                 
                    SIGNAL                              36160      36150      0.07      21                                 
                    SITE                                38930      23044      0.07      20                                 
                    STRAND                             137432      13488      0.26       8                                 
                    TOPO_DOM                           121586      24783      0.23      10                                 
                    TRANSIT                              7383       7296      0.01      33                                 
                    TRANSMEM                           344063      70425      0.65       2                                 
                    TURN                                32592      11370      0.06      24                                 
                    UNSURE                               2499        490     <0.01      36                                 
                    VAR_SEQ                             40042      17225      0.08      19                                 
                    VARIANT                             81171      16583      0.15      17                                 
                    ZN_FING                             28784      12488      0.05      25                                 
                    
                    Total number of feature keys: 39
                    
                    
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank      Category
                    ------------------------------------  -------- ---------  ---------  ----      -------------------------------------------
                    Cross-references (DR)               14209402                27.01                                                           
                    2DBase-Ecoli                           85         85     <0.01     122      2D gel databases                             
                    Aarhus/Ghent-2DPAGE                   126         96     <0.01     119      2D gel databases                             
                    AGD                                   876        870     <0.01      98      Organism-specific databases                  
                    Allergome                            1320        799     <0.01      93      Protein family/group databases               
                    ANU-2DPAGE                             23         23     <0.01     128      2D gel databases                             
                    ArachnoServer                         759        755     <0.01     100      Organism-specific databases                  
                    ArrayExpress                        58527      58527      0.11      42      Gene expression databases                    
                    Bgee                                40008      40008      0.08      47      Gene expression databases                    
                    BindingDB                             297        297     <0.01     115      Other                                        
                    BioCyc                             252182     243567      0.48      19      Enzyme and pathway databases                 
                    BRENDA                              65252      62452      0.12      41      Enzyme and pathway databases                 
                    CAZy                                 7256       6511      0.01      71      Protein family/group databases               
                    CGD                                   591        582     <0.01     104      Organism-specific databases                  
                    CleanEx                             30159      29509      0.06      49      Gene expression databases                    
                    COMPLUYEAST-2DPAGE                    101        100     <0.01     121      2D gel databases                             
                    ConoServer                            613        587     <0.01     103      Organism-specific databases                  
                    Cornea-2DPAGE                          67         67     <0.01     123      2D gel databases                             
                    CTD                                 65750      65172      0.13      39      Organism-specific databases                  
                    CYGD                                 6638       6555      0.01      73      Organism-specific databases                  
                    dictyBase                            4060       4060      0.01      86      Organism-specific databases                  
                    DIP                                 12504      12382      0.02      64      Protein-protein interaction databases        
                    DisProt                               397        394     <0.01     110      3D structure databases                       
                    DOSAC-COBS-2DPAGE                     149        147     <0.01     118      2D gel databases                             
                    DrugBank                             5317       1626      0.01      76      Other                                        
                    EchoBASE                             4167       4163      0.01      85      Organism-specific databases                  
                    ECO2DBASE                             352        300     <0.01     112      2D gel databases                             
                    EcoGene                              4291       4289      0.01      84      Organism-specific databases                  
                    eggNOG                             218877     218877      0.42      20      Phylogenomic databases                       
                    EMBL                               880960     515770      1.67       3      Sequence databases                           
                    Ensembl                             74751      58215      0.14      34      Genome annotation databases                  
                    EnsemblBacteria                     98113      84892      0.19      29      Genome annotation databases                  
                    EnsemblFungi                        14913      14762      0.03      62      Genome annotation databases                  
                    EnsemblMetazoa                      10728       8443      0.02      66      Genome annotation databases                  
                    EnsemblPlants                       15561      13507      0.03      60      Genome annotation databases                  
                    EnsemblProtists                      4425       4307      0.01      83      Genome annotation databases                  
                    euHCVdb                                55         44     <0.01     124      Organism-specific databases                  
                    EuPathDB                              301        301     <0.01     114      Organism-specific databases                  
                    FlyBase                              5769       5395      0.01      75      Organism-specific databases                  
                    Gene3D                             302835     234549      0.58      17      Family and domain databases                  
                    GeneCards                           20403      19772      0.04      54      Organism-specific databases                  
                    GeneDB_Spombe                        4978       4934      0.01      78      Organism-specific databases                  
                    GeneFarm                             2925       2911      0.01      89      Organism-specific databases                  
                    GeneID                             463282     443757      0.88       7      Genome annotation databases                  
                    GeneTree                           167653     167613      0.32      22      Phylogenomic databases                       
                    Genevestigator                      65617      65617      0.12      40      Gene expression databases                    
                    GenoList                             7052       7040      0.01      72      Organism-specific databases                  
                    GenomeReviews                      380805     360721      0.72      10      Genome annotation databases                  
                    GermOnline                          41918      41300      0.08      46      Gene expression databases                    
                    GlycoSuiteDB                          272        272     <0.01     116      PTM databases                                
                    GO                                2128712     492688      4.05       1      Ontologies                                   
                    Gramene                              4558       4558      0.01      82      Organism-specific databases                  
                    H-InvDB                             13204      12307      0.03      63      Organism-specific databases                  
                    HAMAP                              308983     308838      0.59      16      Family and domain databases                  
                    HGNC                                19705      19534      0.04      56      Organism-specific databases                  
                    HOGENOM                            362836     362836      0.69      12      Phylogenomic databases                       
                    HOVERGEN                            74729      74729      0.14      35      Phylogenomic databases                       
                    HPA                                 11292       8331      0.02      65      Organism-specific databases                  
                    HSSP                                29614      29614      0.06      50      3D structure databases                       
                    InParanoid                          67343      67343      0.13      38      Phylogenomic databases                       
                    IntAct                              24516      24516      0.05      52      Protein-protein interaction databases        
                    InterPro                          1718692     501141      3.27       2      Family and domain databases                  
                    IPI                                 90848      65004      0.17      31      Sequence databases                           
                    KEGG                               442154     420932      0.84       8      Genome annotation databases                  
                    LegioList                             761        759     <0.01      99      Organism-specific databases                  
                    Leproma                               671        668     <0.01     102      Organism-specific databases                  
                    MaizeGDB                              477        473     <0.01     107      Organism-specific databases                  
                    MEROPS                              10319       9986      0.02      67      Protein family/group databases               
                    MGI                                 16246      16201      0.03      59      Organism-specific databases                  
                    MIM                                 16540      12924      0.03      58      Organism-specific databases                  
                    MINT                                17522      17522      0.03      57      Protein-protein interaction databases        
                    NextBio                             48922      48920      0.09      44      Other                                        
                    neXtProt                            20064      20064      0.04      55      Organism-specific databases                  
                    NMPDR                              131762     131757      0.25      26      Genome annotation databases                  
                    OGP                                   377        377     <0.01     111      2D gel databases                             
                    OMA                                370951     370951      0.71      11      Phylogenomic databases                       
                    Orphanet                             3759       2285      0.01      87      Organism-specific databases                  
                    OrthoDB                             76452      76396      0.15      33      Phylogenomic databases                       
                    PANTHER                            163559     155752      0.31      23      Family and domain databases                  
                    Pathway_Interaction_DB               4567       1665      0.01      81      Enzyme and pathway databases                 
                    PDB                                 73662      16636      0.14      37      3D structure databases                       
                    PDBsum                              73662      16636      0.14      36      3D structure databases                       
                    PeptideAtlas                         5166       5166      0.01      77      Proteomic databases                          
                    PeroxiBase                            739        728     <0.01     101      Protein family/group databases               
                    Pfam                               696393     486665      1.32       4      Family and domain databases                  
                    PharmGKB                            15420      15113      0.03      61      Organism-specific databases                  
                    PHCI-2DPAGE                           247        247     <0.01     117      2D gel databases                             
                    PhosphoSite                         20439      20439      0.04      53      PTM databases                                
                    PhosSite                              351        351     <0.01     113      PTM databases                                
                    PhylomeDB                          122828     122828      0.23      27      Phylogenomic databases                       
                    PIR                                116424     106410      0.22      28      Sequence databases                           
                    PIRSF                               85789      85789      0.16      32      Family and domain databases                  
                    PMAP-CutDB                           1394       1394     <0.01      92      Other                                        
                    PMMA-2DPAGE                            52         52     <0.01     125      2D gel databases                             
                    PptaseDB                               34         34     <0.01     126      Protein family/group databases               
                    PRIDE                               54447      54447      0.10      43      Proteomic databases                          
                    PRINTS                             137985     119625      0.26      25      Family and domain databases                  
                    ProDom                              27694      27515      0.05      51      Family and domain databases                  
                    ProMEX                                474        474     <0.01     108      Proteomic databases                          
                    PROSITE                            467191     296810      0.89       6      Family and domain databases                  
                    ProtClustDB                        339443     339443      0.65      14      Phylogenomic databases                       
                    ProteinModelPortal                 416006     416006      0.79       9      3D structure databases                       
                    PseudoCAP                            1223       1214     <0.01      95      Organism-specific databases                  
                    Rat-heart-2DPAGE                       28         28     <0.01     127      2D gel databases                             
                    Reactome                             8997       5318      0.02      69      Enzyme and pathway databases                 
                    REBASE                                441        398     <0.01     109      Protein family/group databases               
                    RefSeq                             486271     444029      0.92       5      Sequence databases                           
                    REPRODUCTION-2DPAGE                  1255       1034     <0.01      94      2D gel databases                             
                    RGD                                  7498       7494      0.01      70      Organism-specific databases                  
                    SGD                                  6638       6572      0.01      74      Organism-specific databases                  
                    Siena-2DPAGE                          102        102     <0.01     120      2D gel databases                             
                    SMART                              155223     118085      0.30      24      Family and domain databases                  
                    SMR                                349002     349002      0.66      13      3D structure databases                       
                    STRING                             206125     206102      0.39      21      Protein-protein interaction databases        
                    SUPFAM                             314337     248555      0.60      15      Family and domain databases                  
                    SWISS-2DPAGE                         1184       1183     <0.01      96      2D gel databases                             
                    TAIR                                10236      10150      0.02      68      Organism-specific databases                  
                    TCDB                                 3514       3502      0.01      88      Protein family/group databases               
                    TIGR                                34225      33454      0.07      48      Genome annotation databases                  
                    TIGRFAMs                           285672     265682      0.54      18      Family and domain databases                  
                    TubercuList                          1763       1727     <0.01      91      Organism-specific databases                  
                    UCD-2DPAGE                            511        502     <0.01     106      2D gel databases                             
                    UCSC                                48660      39663      0.09      45      Genome annotation databases                  
                    UniGene                             93128      85286      0.18      30      Sequence databases                           
                    VectorBase                            527        514     <0.01     105      Genome annotation databases                  
                    World-2DPAGE                          916        905     <0.01      97      2D gel databases                             
                    WormBase                             4640       3817      0.01      79      Organism-specific databases                  
                    Xenbase                              4603       4581      0.01      80      Organism-specific databases                  
                    ZFIN                                 2648       2636      0.01      90      Organism-specific databases                  
                    
                    Total number of cross-referenced databases: 128
                    
                    6.  AMINO ACID COMPOSITION
                    
                    6.1  Composition in percent for the complete database
                    
                    Ala (A) 8.27   Gln (Q) 3.93   Leu (L) 9.67   Ser (S) 6.53
                    Arg (R) 5.53   Glu (E) 6.76   Lys (K) 5.85   Thr (T) 5.33
                    Asn (N) 4.05   Gly (G) 7.09   Met (M) 2.42   Trp (W) 1.08
                    Asp (D) 5.45   His (H) 2.27   Phe (F) 3.86   Tyr (Y) 2.92
                    Cys (C) 1.36   Ile (I) 5.98   Pro (P) 4.69   Val (V) 6.87
                    
                    Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.00
                    
                    
                    
                    [Chart of amino acid composition not shown. Legend: gray = aliphatic,
                    red = acidic, green = small hydroxy, blue = basic, black = aromatic,
                    white = amide, yellow = sulfur]
                    
                    
                    6.2  Classification of the amino acids by their frequency
                    
                    Leu, Ala, Gly, Val, Glu, Ser, Ile, Lys, Arg, Asp, Thr, Pro, Asn, Gln,
                    Phe, Tyr, Met, His, Cys, Trp
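
The ordering in 6.2 can be reproduced from the percentages in 6.1. The following is a minimal sketch, not the procedure used to build the release: the dictionary simply transcribes the table in 6.1, and the names `composition` and `rank_by_frequency` are illustrative.

```python
# Percent composition transcribed from section 6.1 (three-letter codes).
composition = {
    "Ala": 8.27, "Gln": 3.93, "Leu": 9.67, "Ser": 6.53,
    "Arg": 5.53, "Glu": 6.76, "Lys": 5.85, "Thr": 5.33,
    "Asn": 4.05, "Gly": 7.09, "Met": 2.42, "Trp": 1.08,
    "Asp": 5.45, "His": 2.27, "Phe": 3.86, "Tyr": 2.92,
    "Cys": 1.36, "Ile": 5.98, "Pro": 4.69, "Val": 6.87,
}


def rank_by_frequency(comp):
    """Return residue names ordered from most to least frequent."""
    return sorted(comp, key=lambda aa: comp[aa], reverse=True)


ranking = rank_by_frequency(composition)
print(", ".join(ranking))

# The published figures are rounded to two decimals, so the total
# need not be exactly 100.
print(round(sum(composition.values()), 2))
```

Since no two residues share the same percentage in this release, the ordering is unambiguous.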
                    
                    
                    7.  MISCELLANEOUS STATISTICS
                    
                    4445 entries are encoded on a mitochondrion, and 3601 are encoded on a plasmid.
                    
                    12183 entries are encoded on a plastid, of which 21 are encoded on
                    apicoplasts, 11619 on chloroplasts, 50 on organellar chromatophores,
                    145 on cyanelles, 149 on non-photosynthetic plastids and 199 on
                    unspecified types of plastid.
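
As a consistency check, the plastid subtypes listed above should add up to the stated plastid total. This is a small sketch that uses only the figures given in this section; the variable names are illustrative.

```python
# Counts transcribed from section 7.
plastid_total = 12183
plastid_breakdown = {
    "apicoplast": 21,
    "chloroplast": 11619,
    "organellar chromatophore": 50,
    "cyanelle": 145,
    "non-photosynthetic plastid": 149,
    "unspecified plastid": 199,
}

breakdown_sum = sum(plastid_breakdown.values())
print(breakdown_sum == plastid_total)

# Share of each subtype among plastid-encoded entries, in percent.
for name, count in sorted(plastid_breakdown.items(), key=lambda kv: -kv[1]):
    print(f"{name:28s} {count:6d} {100 * count / plastid_total:6.2f}%")
```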
                    
                    Number of entries with at least one sequence correction: 70533