                    UniProtKB/Swiss-Prot protein knowledgebase release 2011_05 statistics
                    
                    
                    1.  INTRODUCTION
                    
                    Release 2011_05 of 03-May-11 of UniProtKB/Swiss-Prot contains 528048 sequence entries,
                    comprising 186939477 amino acids abstracted from 197735 references. 
                    
                    Since release 2011_04, 1114 sequences have been added, the sequence data of
                    138 existing entries have been updated, and the annotations of
                    92987 entries have been revised.
                    
                    Number of fragments: 8855
                    Number of additional sequences produced by alternative splicing, initiation or promoter usage, or ribosomal frameshifting: 30207
                    
                    
                    
                    Protein existence (PE):           entries     %
                    
                    1: Evidence at protein level        72952   13.8%
                    2: Evidence at transcript level     68710   13.0%
                    3: Inferred from homology          370166   70.1%
                    4: Predicted                        14362    2.7%
                    5: Uncertain                         1858    0.4%
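The percentages in the protein existence table follow directly from the entry counts. As a minimal sketch, they can be recomputed as shown here; the names `PE_COUNTS`, `TOTAL_ENTRIES` and `pe_percentages` are illustrative and not part of any UniProt tool.

```python
# Entry counts copied from the protein existence (PE) table.
PE_COUNTS = {
    "1: Evidence at protein level": 72952,
    "2: Evidence at transcript level": 68710,
    "3: Inferred from homology": 370166,
    "4: Predicted": 14362,
    "5: Uncertain": 1858,
}

# Total number of sequence entries stated in the introduction.
TOTAL_ENTRIES = 528048


def pe_percentages(counts, total):
    """Return each PE level's share of the total, rounded to one decimal."""
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


pe_shares = pe_percentages(PE_COUNTS, TOTAL_ENTRIES)

for label, share in pe_shares.items():
    print(f"{label:<34}{PE_COUNTS[label]:>7}{share:>7.1f}%")
```

The five counts add up to the release total, so every entry carries exactly one PE level in these figures.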
                    
                    The growth of the database is summarized below.
                    
                    
                    
                    
                    2.  TAXONOMIC ORIGIN
                    
                    Total number of species represented in this release of UniProtKB/Swiss-Prot: 12465
                    
                    The twenty most represented species account for 109995 sequences, or 20.8% of the
                    total number of entries.
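The 109995 figure can be reproduced from the first twenty rows of the table in section 2.2. The sketch below is illustrative only; `TOP_20_COUNTS` is a hypothetical name, and the values are copied by hand from that table.

```python
# Frequencies of the twenty most represented species (section 2.2, rows 1-20).
TOP_20_COUNTS = [
    20238, 16359, 10338, 7606, 6561, 5838, 4976, 4430, 4244, 4161,
    3331, 3305, 3120, 2727, 2687, 2214, 2204, 1997, 1872, 1787,
]

# Total number of sequence entries in the release.
TOTAL_ENTRIES = 528048

top20_total = sum(TOP_20_COUNTS)
top20_share = round(100 * top20_total / TOTAL_ENTRIES, 1)

print(top20_total, top20_share)
```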
                    
                    
                    2.1 Table of the frequency of occurrence of species
                    
                    Species represented      1x: 5313
                                             2x: 1779
                                             3x:  948
                                             4x:  612
                                             5x:  449
                                             6x:  355
                                             7x:  258
                                             8x:  212
                                             9x:  194
                                            10x:  105
                                        11- 20x:  626
                                        21- 50x:  389
                                        51-100x:  200
                                          >100x: 1025
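Each species falls into exactly one frequency class, so the classes should add up to the 12465 species reported for the release. A short consistency check, with the hypothetical name `SPECIES_BINS` holding values copied from the table:

```python
# (frequency class, number of species) pairs from the table in section 2.1.
SPECIES_BINS = [
    ("1x", 5313), ("2x", 1779), ("3x", 948), ("4x", 612), ("5x", 449),
    ("6x", 355), ("7x", 258), ("8x", 212), ("9x", 194), ("10x", 105),
    ("11-20x", 626), ("21-50x", 389), ("51-100x", 200), (">100x", 1025),
]

# Number of species stated at the start of section 2.
TOTAL_SPECIES = 12465

species_sum = sum(count for _, count in SPECIES_BINS)

# Share of species represented by a single entry, in percent.
singleton_share = 100 * SPECIES_BINS[0][1] / TOTAL_SPECIES

print(species_sum == TOTAL_SPECIES, f"{singleton_share:.1f}%")
```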
                    
                    
                    2.2  Table of the most represented species
                    
                    ------  ---------  --------------------------------------------
                    Number  Frequency  Species
                    ------  ---------  --------------------------------------------
                    1      20238  Homo sapiens (Human)
                    2      16359  Mus musculus (Mouse)
                    3      10338  Arabidopsis thaliana (Mouse-ear cress)
                    4       7606  Rattus norvegicus (Rat)
                    5       6561  Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast)
                    6       5838  Bos taurus (Bovine)
                    7       4976  Schizosaccharomyces pombe (strain ATCC 38366 / 972) (Fission yeast)
                    8       4430  Escherichia coli (strain K12)
                    9       4244  Bacillus subtilis
                    10       4161  Dictyostelium discoideum (Slime mold)
                    11       3331  Caenorhabditis elegans
                    12       3305  Xenopus laevis (African clawed frog)
                    13       3120  Drosophila melanogaster (Fruit fly)
                    14       2727  Danio rerio (Zebrafish) (Brachydanio rerio)
                    15       2687  Oryza sativa subsp. japonica (Rice)
                    16       2214  Pongo abelii (Sumatran orangutan)
                    17       2204  Gallus gallus (Chicken)
                    18       1997  Escherichia coli O157:H7
                    19       1872  Mycobacterium tuberculosis
                    20       1787  Methanocaldococcus jannaschii  
                    21       1780  Salmonella typhimurium
                    22       1707  Haemophilus influenzae (strain ATCC 51907 / DSM 11121 / KW20 / Rd)
                    23       1674  Shigella flexneri
                    24       1672  Escherichia coli O6
                    25       1597  Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
                    26       1386  Sus scrofa (Pig)
                    27       1342  Salmonella typhi
                    28       1285  Pseudomonas aeruginosa
                    29       1243  Mycobacterium bovis
                    30       1166  Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey)
                    31       1024  Synechocystis sp. (strain ATCC 27184 / PCC 6803 / N-1)
                    32       1000  Yersinia pestis
                    33        997  Archaeoglobus fulgidus
                    34        946  Vibrio cholerae
                    35        929  Salmonella paratyphi A
                    36        924  Staphylococcus aureus (strain N315)
                    37        923  Staphylococcus aureus (strain Mu50 / ATCC 700699)
                    38        913  Rhizobium meliloti (Sinorhizobium meliloti)
                    39        909  Acanthamoeba polyphaga mimivirus (APMV)
                    40        897  Staphylococcus aureus (strain COL)
                    41        895  Staphylococcus aureus (strain MW2)
                    42        889  Staphylococcus aureus (strain MSSA476)
                    43        887  Staphylococcus aureus (strain MRSA252)
                    44        885  Escherichia coli O6:K15:H31 (strain 536 / UPEC)
                    45        884  Ashbya gossypii (strain ATCC 10895 / CBS 109.51 / FGSC 9923 / NRRL Y-1056)  
                    46        884  Oryctolagus cuniculus (Rabbit)
                    47        881  Salmonella choleraesuis
                    48        876  Shigella sonnei (strain Ss046)
                    49        864  Yersinia pseudotuberculosis
                    50        860  Kluyveromyces lactis (Yeast) (Candida sphaerica)
                    51        841  Escherichia coli O9:H4 (strain HS)
                    52        837  Candida albicans (Yeast)
                    53        834  Escherichia coli O139:H28 (strain E24377A / ETEC)
                    54        826  Shigella boydii serotype 4 (strain Sb227)
                    55        823  Escherichia coli (strain UTI89 / UPEC)
                    56        819  Escherichia coli (strain ATCC 8739 / DSM 1576 / Crooks)
                    57        816  Candida glabrata (Yeast) (Torulopsis glabrata)
                    58        808  Shigella dysenteriae serotype 1 (strain Sd197)
                    59        801  Neurospora crassa
                    60        794  Vibrio parahaemolyticus
                    61        790  Escherichia coli (strain SMS-3-5 / SECEC)
                    62        780  Canis familiaris (Dog) (Canis lupus familiaris)
                    63        779  Erwinia carotovora subsp. atroseptica (Pectobacterium atrosepticum)
                    64        778  Pasteurella multocida
                    65        773  Aquifex aeolicus
                    66        770  Escherichia coli (strain K12 / DH10B)
                    67        764  Escherichia coli O127:H6 (strain E2348/69 / EPEC)
                    68        764  Escherichia coli (strain K12 / MC4100 / BW2952)
                    69        762  Escherichia coli O17:K52:H18 (strain UMN026 / ExPEC)
                    70        762  Escherichia coli (strain 55989 / EAEC)
                    71        761  Escherichia coli O8 (strain IAI1)
                    72        757  Shigella flexneri serotype 5b (strain 8401)
                    73        757  Staphylococcus epidermidis (strain ATCC 35984 / RP62A)
                    74        756  Escherichia coli (strain SE11)
                    75        756  Staphylococcus epidermidis (strain ATCC 12228)
                    76        756  Escherichia coli O45:K1 (strain S88 / ExPEC)
                    77        753  Escherichia coli O7:K1 (strain IAI39 / ExPEC)
                    78        753  Streptomyces coelicolor
                    79        747  Emericella nidulans (Aspergillus nidulans)
                    80        746  Escherichia coli O157:H7 (strain EC4115 / EHEC)
                    81        741  Photorhabdus luminescens subsp. laumondii
                    82        732  Yersinia enterocolitica serotype O:8 / biotype 1B (strain 8081)
                    83        731  Escherichia coli O81 (strain ED1a)
                    84        731  Bacillus halodurans
                    85        731  Vibrio vulnificus
                    86        724  Bacillus anthracis
                    87        720  Salmonella enteritidis PT4 (strain P125109)
                    88        716  Vibrio vulnificus (strain YJ016)
                    89        715  Salmonella paratyphi B (strain ATCC BAA-1250 / SPB7)
                    90        714  Staphylococcus aureus (strain NCTC 8325)
                    91        713  Yersinia pestis bv. Antiqua (strain Nepal516)
                    92        713  Salmonella paratyphi A (strain AKU_12601)
                    93        713  Yersinia pseudotuberculosis serotype O:1b (strain IP 31758)
                    94        711  Salmonella agona (strain SL483)
                    95        711  Escherichia coli O1:K1 / APEC
                    96        711  Salmonella newport (strain SL254)
                    97        710  Salmonella heidelberg (strain SL476)
                    98        709  Yersinia pestis bv. Antiqua (strain Antiqua)
                    99        709  Salmonella schwarzengrund (strain CVM19633)
                    100        706  Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)
                    101        706  Enterobacter sp. (strain 638)
                    102        700  Salmonella dublin (strain CT_02021853)
                    103        697  Shigella boydii serotype 18 (strain CDC 3083-94 / BS512)
                    104        691  Klebsiella pneumoniae (strain 342)
                    105        688  Escherichia fergusonii (strain ATCC 35469 / DSM 13698 / CDC 0568-73)
                    106        687  Mycoplasma pneumoniae
                    107        686  Pan troglodytes (Chimpanzee)
                    108        685  Nostoc sp. (strain PCC 7120 / UTEX 2576)
                    109        684  Pseudomonas syringae pv. tomato
                    110        682  Salmonella gallinarum (strain 287/91 / NCTC 13346)
                    111        679  Zea mays (Maize)
                    112        673  Pseudomonas putida (strain KT2440)
                    113        671  Citrobacter koseri (strain ATCC BAA-895 / CDC 4225-83 / SGSC4696)
                    114        668  Mycobacterium leprae
                    115        666  Staphylococcus aureus (strain USA300)
                    116        666  Yersinia pestis (strain Pestoides F)
                    117        662  Serratia proteamaculans (strain 568)
                    118        658  Rhizobium sp. (strain NGR234)
                    119        650  Bradyrhizobium japonicum
                    120        642  Staphylococcus aureus (strain bovine RF122 / ET3-1)
                    121        641  Escherichia coli
                    122        639  Bacillus cereus (strain ATCC 14579 / DSM 31)
                    123        637  Debaryomyces hansenii (Yeast) (Torulaspora hansenii)
                    124        637  Yersinia pseudotuberculosis serotype O:3 (strain YPIII)
                    125        636  Salmonella arizonae (strain ATCC BAA-731 / CDC346-86 / RSK2980)
                    126        633  Yersinia pseudotuberculosis serotype IB (strain PB1/+)
                    127        626  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
                    128        622  Yarrowia lipolytica (Candida lipolytica)
                    129        621  Shewanella oneidensis
                    130        615  Treponema pallidum
                    131        614  Ralstonia solanacearum (Pseudomonas solanacearum)
                    132        613  Enterobacter sakazakii (strain ATCC BAA-894)
                    133        611  Staphylococcus haemolyticus (strain JCSC1435)
                    134        606  Neosartorya fumigata (strain ATCC MYA-4609 / Af293 / CBS 101355 / FGSC A1100) 
                    135        605  Rhizobium loti (Mesorhizobium loti)
                    136        605  Methanobacterium thermoautotrophicum (strain Delta H)
                    137        602  Staphylococcus saprophyticus subsp. saprophyticus 
                    138        600  Yersinia pestis bv. Antiqua (strain Angola)
                    139        600  Salmonella paratyphi C (strain RKS4594)
                    140        598  Photobacterium profundum (Photobacterium sp. (strain SS9))
                    141        598  Listeria monocytogenes
                    142        590  Bacillus cereus (strain ATCC 10987)
                    143        590  Xanthomonas campestris pv. campestris
                    144        588  Listeria innocua
                    145        585  Rickettsia prowazekii
                    146        585  Pectobacterium carotovorum subsp. carotovorum (strain PC1)
                    147        584  Helicobacter pylori (Campylobacter pylori)
                    148        581  Lactococcus lactis subsp. lactis (Streptococcus lactis)
                    149        579  Neisseria meningitidis serogroup B
                    150        576  Brucella suis
                    151        572  Brucella melitensis
                    152        572  Buchnera aphidicola subsp. Acyrthosiphon pisum 
                    153        569  Bacillus thuringiensis subsp. konkukian
                    154        565  Helicobacter pylori J99 (Campylobacter pylori J99)
                    155        565  Pseudomonas syringae pv. syringae (strain B728a)
                    156        562  Buchnera aphidicola subsp. Schizaphis graminum
                    157        561  Caulobacter crescentus (Caulobacter vibrioides)
                    158        560  Bacillus licheniformis (strain DSM 13 / ATCC 14580)
                    159        560  Bacillus cereus (strain ZK / E33L)
                    160        557  Pseudomonas aeruginosa (strain UCBPP-PA14)
                    161        556  Neisseria meningitidis serogroup A
                    162        556  Clostridium acetobutylicum
                    163        556  Xanthomonas axonopodis pv. citri (Citrus canker)
                    164        554  Vibrio fischeri (strain ATCC 700601 / ES114)
                    165        553  Oryza sativa subsp. indica (Rice)
                    166        552  Pseudomonas fluorescens (strain Pf0-1)
                    167        551  Caenorhabditis briggsae
                    168        549  Oceanobacillus iheyensis
                    169        545  Pseudomonas fluorescens (strain Pf-5 / ATCC BAA-477)
                    170        543  Pseudomonas syringae pv. phaseolicola (strain 1448A / Race 6)
                    171        529  Listeria monocytogenes serotype 4b (strain F2365)
                    172        526  Sodalis glossinidius (strain morsitans)
                    173        526  Erwinia tasmaniensis (strain DSM 17950 / Et1/99)
                    174        522  Xylella fastidiosa
                    175        521  Bordetella bronchiseptica (Alcaligenes bronchisepticus)
                    176        519  Streptococcus pneumoniae
                    177        514  Thermotoga maritima
                    178        513  Chromobacterium violaceum
                    179        512  Xylella fastidiosa (strain Temecula1 / ATCC 700964)
                    180        510  Vibrio cholerae serotype O1 (strain ATCC 39541 / Ogawa 395 / O395)
                    181        507  Bordetella parapertussis
                    182        507  Buchnera aphidicola subsp. Baizongia pistaciae (strain Bp)
                    183        507  Pseudomonas aeruginosa (strain PA7)
                    184        507  Haemophilus ducreyi
                    185        506  Bordetella pertussis
                    186        505  Staphylococcus aureus (strain Newman)
                    187        505  Geobacillus kaustophilus
                    188        500  Pseudomonas entomophila (strain L48)
                    189        499  Deinococcus radiodurans
                    190        498  Brucella abortus
                    191        497  Rickettsia conorii
                    192        496  Bacillus clausii (strain KSM-K16)
                    193        494  Haemophilus influenzae (strain 86-028NP)
                    194        493  Corynebacterium glutamicum (Brevibacterium flavum)
                    195        493  Streptomyces avermitilis
                    196        491  Xanthomonas campestris pv. campestris (strain 8004)
                    197        490  Vibrio harveyi (strain ATCC BAA-1116 / BB120)
                    198        490  Clostridium perfringens
                    199        488  Bacillus amyloliquefaciens (strain FZB42)
                    200        487  Burkholderia pseudomallei (Pseudomonas pseudomallei)
                    201        487  Shewanella sp. (strain MR-7)
                    202        484  Methanosarcina acetivorans (strain ATCC 35395 / DSM 2834 / JCM 12185 / C2A)
                    203        484  Mannheimia succiniciproducens (strain MBEL55E)
                    204        484  Pseudomonas aeruginosa (strain LESB58)
                    205        484  Staphylococcus aureus (strain Mu3 / ATCC 700698)
                    206        484  Shewanella sp. (strain MR-4)
                    207        483  Proteus mirabilis (strain HI4320)
                    208        483  Mycoplasma genitalium
                    209        475  Acinetobacter sp. (strain ADP1)
                    210        475  Synechococcus elongatus (strain PCC 7942) (Anacystis nidulans R2)
                    211        474  Burkholderia sp. (strain 383) (Burkholderia cepacia 
                    212        474  Thermosynechococcus elongatus (strain BP-1)
                    213        473  Pseudomonas putida (strain F1 / ATCC 700007)
                    214        473  Pyrococcus horikoshii 
                    215        472  Brucella abortus (strain 2308)
                    216        472  Enterococcus faecalis (Streptococcus faecalis)
                    217        466  Rhodopseudomonas palustris
                    218        466  Xanthomonas campestris pv. vesicatoria (strain 85-10)
                    219        465  Pyrococcus abyssi (strain GE5 / Orsay)
                    220        465  Pseudomonas putida (strain GB-1)
                    221        464  Shewanella frigidimarina (strain NCIMB 400)
                    222        463  Methanosarcina mazei  
                    223        462  Lactobacillus plantarum
                    224        462  Anabaena variabilis (strain ATCC 29413 / PCC 7937)
                    225        462  Shewanella sp. (strain ANA-3)
                    226        461  Burkholderia mallei (Pseudomonas mallei)
                    227        461  Cupriavidus necator (strain ATCC 17699 / H16 / DSM 428 / Stanier 337) 
                    228        459  Aeromonas hydrophila subsp. hydrophila (strain ATCC 7966 / NCIB 9240)
                    229        458  Cupriavidus pinatubonensis (strain JMP134 / LMG 1197) (Alcaligenes eutrophus) 
                    230        457  Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
                    231        455  Staphylococcus aureus (strain JH1)
                    232        455  Halobacterium salinarium (Halobacterium halobium)
                    233        454  Aspergillus oryzae (strain ATCC 42149 / RIB 40)
                    234        454  Xanthomonas oryzae pv. oryzae (strain MAFF 311018)
                    235        453  Rickettsia felis (Rickettsia azadi)
                    236        452  Shewanella baltica (strain OS185)
                    237        452  Pseudomonas putida (strain W619)
                    238        451  Ovis aries (Sheep)
                    239        449  Staphylococcus aureus (strain JH9)
                    240        449  Methylococcus capsulatus
                    241        449  Streptococcus mutans
                    242        449  Aeromonas salmonicida (strain A449)
                    243        448  Thermoanaerobacter tengcongensis
                    244        448  Rhodobacter sphaeroides (strain ATCC 17023 / 2.4.1 / NCIB 8253 / DSM 158)
                    245        447  Vibrio fischeri (strain MJ11)
                    246        446  Mycobacterium paratuberculosis
                    247        445  Pyrococcus furiosus (strain ATCC 43587 / DSM 3638 / JCM 8422 / Vc1)
                    248        444  Hahella chejuensis (strain KCTC 2396)
                    249        444  Pseudomonas mendocina (strain ymp)
                    250        444  Dechloromonas aromatica (strain RCB)
                    
                    
                    
                    2.3  Taxonomic distribution of the sequences
                    
                    
                    
                    Kingdom        sequences (% of the database)
                    Archaea           18567 (  4%)
                    Bacteria         325783 ( 62%)
                    Eukaryota        167915 ( 32%)
                    Viruses           15783 (  3%)
                    
                    
                    Within Eukaryota:
                    
                    
                    
                    Category            sequences (% of Eukaryota) (% of the complete database)
                    Human                  20239 ( 12%)           (  4%)
                    Other Mammalia         45109 ( 27%)           (  9%)
                    Other Vertebrata       16636 ( 10%)           (  3%)
                    Viridiplantae          30995 ( 18%)           (  6%)
                    Fungi                  28544 ( 17%)           (  5%)
                    Insecta                 8260 (  5%)           (  2%)
                    Nematoda                4185 (  2%)           (  1%)
                    Other                  13947 (  8%)           (  3%)
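The two percentage columns use different denominators: the Eukaryota subtotal and the complete database. A minimal sketch of that calculation, assuming the percentages are rounded to whole numbers; `EUKARYOTA` and `shares` are illustrative names.

```python
# Sequence counts per category, copied from the "Within Eukaryota" table.
EUKARYOTA = {
    "Human": 20239,
    "Other Mammalia": 45109,
    "Other Vertebrata": 16636,
    "Viridiplantae": 30995,
    "Fungi": 28544,
    "Insecta": 8260,
    "Nematoda": 4185,
    "Other": 13947,
}

# Total number of sequence entries in the release.
TOTAL_ENTRIES = 528048

# Denominator for the first percentage column.
eukaryota_total = sum(EUKARYOTA.values())


def shares(count):
    """Return (% of Eukaryota, % of the complete database) as whole numbers."""
    return (round(100 * count / eukaryota_total),
            round(100 * count / TOTAL_ENTRIES))


for category, count in EUKARYOTA.items():
    pct_euk, pct_db = shares(count)
    print(f"{category:<18}{count:>7} ({pct_euk:>3}%) ({pct_db:>3}%)")
```

The category counts add up to the 167915 Eukaryota sequences given in the kingdom table.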
                    
                    
                    
                    3.  SEQUENCE SIZE
                    
                    Distribution of the sequences by size (excluding fragments)
                    
                    From   To  Number             From   To   Number
                       1-  50    8643             1001-1100     3607
                      51- 100   40679             1101-1200     2498
                     101- 150   56711             1201-1300     1971
                     151- 200   56645             1301-1400     1822
                     201- 250   55535             1401-1500     1479
                     251- 300   48709             1501-1600      714
                     301- 350   49114             1601-1700      533
                     351- 400   42385             1701-1800      436
                     401- 450   34797             1801-1900      407
                     451- 500   27829             1901-2000      332
                     501- 550   19768             2001-2100      205
                     551- 600   14140             2101-2200      274
                     601- 650   11878             2201-2300      283
                     651- 700    8611             2301-2400      168
                     701- 750    7120             2401-2500      132
                     751- 800    5040             >2500         1049
                     801- 850    4395
                     851- 900    4922
                     901- 950    3731
                     951-1000    2631
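Because the size table excludes fragments, its bins should add up to the number of entries minus the number of fragments. The following sketch checks that, assuming the table covers every non-fragment entry; `SIZE_BINS` is a hypothetical name and the counts are copied from the table.

```python
# (upper bound of the size range in amino acids, number of sequences).
# None marks the open-ended ">2500" class.
SIZE_BINS = [
    (50, 8643), (100, 40679), (150, 56711), (200, 56645), (250, 55535),
    (300, 48709), (350, 49114), (400, 42385), (450, 34797), (500, 27829),
    (550, 19768), (600, 14140), (650, 11878), (700, 8611), (750, 7120),
    (800, 5040), (850, 4395), (900, 4922), (950, 3731), (1000, 2631),
    (1100, 3607), (1200, 2498), (1300, 1971), (1400, 1822), (1500, 1479),
    (1600, 714), (1700, 533), (1800, 436), (1900, 407), (2000, 332),
    (2100, 205), (2200, 274), (2300, 283), (2400, 168), (2500, 132),
    (None, 1049),
]

TOTAL_ENTRIES = 528048  # sequence entries in the release
FRAGMENTS = 8855        # number of fragments stated in the introduction

binned_total = sum(count for _, count in SIZE_BINS)

# Share of non-fragment sequences that are at most 500 amino acids long.
up_to_500 = sum(count for upper, count in SIZE_BINS
                if upper is not None and upper <= 500)
share_up_to_500 = 100 * up_to_500 / binned_total

print(binned_total == TOTAL_ENTRIES - FRAGMENTS, f"{share_up_to_500:.1f}%")
```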
                    
                    
                    
                    
                    The average sequence length in UniProtKB/Swiss-Prot is 354 amino acids.
                    
                    The shortest sequence is   GWA_SEPOF (P83570):     2 amino acids.
                    The longest sequence is  TITIN_MOUSE (A2ASS6): 35213 amino acids.
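The average length is consistent with the totals given in the introduction. A short check, assuming the average is taken over all entries, fragments included (the variable names are illustrative):

```python
# Totals stated in the introduction of this release.
TOTAL_AMINO_ACIDS = 186939477
TOTAL_ENTRIES = 528048

# Mean sequence length in amino acids, before rounding.
mean_length = TOTAL_AMINO_ACIDS / TOTAL_ENTRIES

print(round(mean_length))
```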
                    
                    
                    4.  JOURNAL CITATIONS
                    
                    Note: the following citation statistics reflect the number of distinct
                    journal citations.
                    
                    Total number of journals cited in this release of UniProtKB/Swiss-Prot: 2125
                    
                    
                    4.1 Table of the frequency of journal citations
                    
                    Journals cited      1x:  687
                                        2x:  278
                                        3x:  148
                                        4x:  106
                                        5x:   93
                                        6x:   68
                                        7x:   36
                                        8x:   41
                                        9x:   33
                                       10x:   23
                                   11- 20x:  170
                                   21- 50x:  179
                                   51-100x:   98
                                     >100x:  165
                    
                    
                    4.2  List of the most cited journals in UniProtKB/Swiss-Prot
                    
                    Nb    Citations   Journal name
                    --    ---------   -------------------------------------------------------------
                    1        18682   Journal of Biological Chemistry
                    2         8662   Proceedings of the National Academy of Sciences of the U.S.A.
                    3         5181   Journal of Bacteriology
                    4         4672   Biochemical and Biophysical Research Communications
                    5         4535   Gene
                    6         4367   Nucleic Acids Research
                    7         4064   FEBS Letters
                    8         4039   Biochemistry
                    9         3880   The EMBO Journal
                    10         3546   Molecular and Cellular Biology
                    11         3382   Nature
                    12         3222   Journal of Molecular Biology
                    13         3140   European Journal of Biochemistry
                    14         3031   Biochimica et Biophysica Acta
                    15         2796   Cell
                    16         2485   Genomics
                    17         2236   Biochemical Journal
                    18         2212   Science
                    19         2205   Journal of Virology
                    20         1826   Molecular Microbiology
                    21         1641   Journal of Cell Biology
                    22         1540   Plant Molecular Biology
                    23         1493   Plant Physiology
                    24         1438   Genes and Development
                    25         1416   Virology
                    26         1364   Human Molecular Genetics
                    27         1363   Nature Genetics
                    28         1354   The American Journal of Human Genetics
                    29         1311   Molecular and General Genetics
                    30         1244   Oncogene
                    31         1222   Development
                    32         1185   Journal of Biochemistry
                    33         1157   Human Mutation
                    34         1091   Molecular Biology of the Cell
                    35         1053   The Plant Cell
                    36         1039   Journal of Immunology
                    37         1026   Genetics
                    38          950   Journal of General Virology
                    39          931   Structure
                    40          921   Molecular Cell
                    41          900   Infection and Immunity
                    42          886   The Plant Journal
                    43          849   Archives of Biochemistry and Biophysics
                    44          825   Blood
                    45          788   Journal of Cell Science
                    46          778   Microbiology
                    47          764   Yeast
                    48          756   Developmental Biology
                    49          695   Cancer Research
                    50          694   Current Biology
                    51          680   FEMS Microbiology Letters
                    52          605   Nature Structural Biology
                    53          602   Mechanisms of Development
                    54          601   Human Genetics
                    55          585   Acta Crystallographica, Section D
                    56          582   Protein Science
                    57          571   Applied and Environmental Microbiology
                    58          555   Journal of Neuroscience
                    59          540   Toxicon
                    60          530   Current Genetics
                    61          524   Neuron
                    62          520   Journal of Clinical Investigation
                    63          475   American Journal of Physiology
                    64          473   Mammalian Genome
                    65          456   The Journal of Experimental Medicine
                    66          451   Immunogenetics
                    67          450   Molecular Endocrinology
                    68          424   Molecular and Biochemical Parasitology
                    69          418   Journal of Neurochemistry
                    70          413   The Journal of Clinical Endocrinology and Metabolism
                    71          407   Endocrinology
                    72          403   Proteins
                    73          383   Journal of Molecular Evolution
                    74          382   Bioscience, Biotechnology, and Biochemistry
                    75          369   Journal of Medical Genetics
                    76          367   DNA and Cell Biology
                    77          363   Molecular Biology and Evolution
                    78          362   Plant and Cell Physiology
                    79          359   DNA Sequence
                    80          339   Nature Cell Biology
                    81          321   Tissue Antigens
                    82          317   Experimental Cell Research
                    83          317   Brain Research. Molecular Brain Research
                    84          316   Peptides
                    85          302   Comparative Biochemistry and Physiology
                    86          292   Biological Chemistry Hoppe-Seyler
                    87          290   Antimicrobial Agents and Chemotherapy
                    88          285   Journal of Investigative Dermatology
                    89          276   Cytogenetics and Cell Genetics
                    90          273   Molecular Pharmacology
                    91          271   Developmental Cell
                    92          267   Biology of Reproduction
                    93          267   RNA
                    94          258   Neurology
                    95          253   Genome Research
                    96          252   Virus Research
                    97          251   Journal of General Microbiology
                    98          247   Developmental Dynamics
                    99          241   Planta
                    100          235   Molecular Plant-Microbe Interactions
                    101          221   Nature Structural and Molecular Biology
                    102          218   Hoppe-Seyler's Zeitschrift fur Physiologische Chemie
                    103          212   Genes to Cells
                    104          211   Annals of Neurology
                    105          209   Biochimie
                    106          208   DNA Research
                    107          208   European Journal of Immunology
                    108          205   Immunity
                    109          204   The FEBS Journal
                    110          203   Eukaryotic Cell
                    111          203   European Journal of Human Genetics
                    112          200   The New England Journal of Medicine
                    113          193   EMBO Reports
                    114          190   Journal of Human Genetics
                    115          188   PLoS ONE
                    116          180   The FASEB Journal
                    117          179   Investigative Ophthalmology and Visual Science
                    118          177   Molecular and Cellular Endocrinology
                    119          174   Archives of Virology
                    120          169   Archives of Microbiology
                    121          168   Insect Biochemistry and Molecular Biology
                    122          166   American Journal of Medical Genetics
                    123          165   Molecular Immunology
                    124          165   Molecular Phylogenetics and Evolution
                    125          159   DNA
                    126          158   Glycobiology
                    127          157   American Journal of Medical Genetics. Part A
                    128          155   Molecular Reproduction and Development
                    129          155   Diabetes
                    130          153   Hemoglobin
                    131          152   Bioorganicheskaia Khimiia
                    132          151   Clinical Genetics
                    133          151   Journal of the American Chemical Society
                    134          149   BMC Genomics
                    135          149   Journal of Cellular Biochemistry
                    136          146   International Journal of Cancer
                    137          141   Molecular Genetics and Metabolism
                    138          140   Molecular and Cellular Neuroscience
                    139          138   Nature Immunology
                    140          138   Animal Genetics
                    141          137   General and Comparative Endocrinology
                    142          136   Biological Chemistry
                    143          135   Molecular Genetics and Genomics
                    144          132   British Journal of Haematology
                    145          130   Journal of Lipid Research
                    146          128   Proteomics
                    147          126   Journal of Medicinal Chemistry
                    148          124   Circulation Research
                    149          123   Protein Expression and Purification
                    150          122   Agricultural and Biological Chemistry
                    
                    
                    5.  STATISTICS FOR SOME LINE TYPES
                    
                    The following table summarizes the total number of some UniProtKB/Swiss-Prot lines,
                    as well as the number of entries with at least one such line, and the
                    frequency of the lines.
                    
                                                        Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank
                    ------------------------------------  -------- ---------  ---------  ----
                    
                    References (RL)                       963956                 1.83        
                    Journal                            769296     400764      1.46       1
                    Submitted to EMBL/GenBank/DDBJ     186004     171172      0.35       2
                    Submitted to other databases         6567       6127      0.01       3
                    Book citation                         646        632     <0.01       4
                    Plant Gene Register                   566        554     <0.01       5
                    Thesis                                402        399     <0.01       6
                    Unpublished observations              292        288     <0.01       7
                    Patent                                177        175     <0.01       8
                    Worm Breeder's Gazette                  6          6     <0.01       9
                    
                    Total number of distinct authors cited in UniProtKB/Swiss-Prot: 302327
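
The "Average per entry" column appears to be the total number of lines divided by the number of entries in the release (528048, from the introduction), rounded to two decimals, with values that round to zero shown as "<0.01". The sketch below reproduces a few values from the table under that assumption; the function name `average_per_entry` is hypothetical and is not part of any UniProt tool.

```python
# Number of entries in release 2011_05, as stated in the introduction.
TOTAL_ENTRIES = 528048


def average_per_entry(total_lines, total_entries=TOTAL_ENTRIES):
    """Format lines-per-entry the way the tables appear to (assumed rule)."""
    text = f"{total_lines / total_entries:.2f}"
    # Assumption: values that round to 0.00 are displayed as "<0.01".
    return "<0.01" if text == "0.00" else text


# Totals copied from the References (RL) table above.
for label, total in [
    ("References (RL)", 963956),
    ("Journal", 769296),
    ("Book citation", 646),
]:
    print(f"{label:<20} {average_per_entry(total)}")
```

The same rule also matches the header rows of the other tables, for example 14311256 cross-reference lines giving 27.10 per entry.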
                    
                                                        Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank
                    ------------------------------------  -------- ---------  ---------  ----
                    Comments (CC)                        2303194                 4.36        
                    ALLERGEN                              499        499     <0.01      26
                    ALTERNATIVE PRODUCTS                19859      19859      0.04      13
                    BIOPHYSICOCHEMICAL PROPERTIES        3588       3588      0.01      23
                    BIOTECHNOLOGY                         307        305     <0.01      28
                    CATALYTIC ACTIVITY                 231981     211448      0.44       4
                    CAUTION                              7566       7412      0.01      19
                    COFACTOR                           102087      93798      0.19       7
                    DEVELOPMENTAL STAGE                  9241       9241      0.02      17
                    DISEASE                              4521       3056      0.01      21
                    DISRUPTION PHENOTYPE                 3618       3618      0.01      22
                    DOMAIN                              34404      30441      0.07      11
                    ENZYME REGULATION                    9568       9568      0.02      16
                    FUNCTION                           399509     382979      0.76       2
                    INDUCTION                           13000      13000      0.02      15
                    INTERACTION                         13012      13012      0.02      14
                    MASS SPECTROMETRY                    4816       3657      0.01      20
                    MISCELLANEOUS                       30910      28535      0.06      12
                    PATHWAY                            129227     117900      0.24       6
                    PHARMACEUTICAL                         84         84     <0.01      29
                    POLYMORPHISM                          828        788     <0.01      24
                    PTM                                 38264      30730      0.07       9
                    RNA EDITING                           621        621     <0.01      25
                    SEQUENCE CAUTION                    39253      39253      0.07       8
                    SIMILARITY                         621551     503226      1.18       1
                    SUBCELLULAR LOCATION               311234     305852      0.59       3
                    SUBUNIT                            229074     229074      0.43       5
                    TISSUE SPECIFICITY                  35477      35477      0.07      10
                    TOXIC DOSE                            462        449     <0.01      27
                    WEB RESOURCE                         8633       6915      0.02      18
                    
                    Total number of comment topics: 29
                    
                    
                                                        Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank
                    ------------------------------------  -------- ---------  ---------  ----
                    Features (FT)                        3385697                 6.41        
                    ACT_SITE                           132642      79848      0.25       9
                    BINDING                            222377      62826      0.42       4
                    CA_BIND                              3771       1558      0.01      35
                    CARBOHYD                           103595      26228      0.20      13
                    CHAIN                              534565     522463      1.01       1
                    COILED                              18993      12978      0.04      26
                    COMPBIAS                            51541      27045      0.10      18
                    CONFLICT                           119592      41940      0.23      11
                    CROSSLNK                             6034       3613      0.01      34
                    DISULFID                           100925      27238      0.19      15
                    DNA_BIND                            11155      10279      0.02      30
                    DOMAIN                             153809      91886      0.29       6
                    HELIX                              142685      14909      0.27       7
                    INIT_MET                            14970      14970      0.03      27
                    INTRAMEM                             1870        807     <0.01      38
                    LIPID                               10941       6937      0.02      31
                    METAL                              287268      70265      0.54       3
                    MOD_RES                            183988      60783      0.35       5
                    MOTIF                               33752      21743      0.06      23
                    MUTAGEN                             34881       8222      0.07      22
                    NON_CONS                             1945        728     <0.01      37
                    NON_STD                               351        276     <0.01      39
                    NON_TER                             11995       9123      0.02      29
                    NP_BIND                            109911      69746      0.21      12
                    PEPTIDE                              9516       6354      0.02      32
                    PROPEP                              12081      10363      0.02      28
                    REGION                             101524      54596      0.19      14
                    REPEAT                              91268      13508      0.17      16
                    SIGNAL                              36530      36520      0.07      21
                    SITE                                39240      23159      0.07      20
                    STRAND                             140981      13880      0.27       8
                    TOPO_DOM                           122210      24994      0.23      10
                    TRANSIT                              7520       7433      0.01      33
                    TRANSMEM                           344686      70729      0.65       2
                    TURN                                33328      11676      0.06      24
                    UNSURE                               2514        497     <0.01      36
                    VAR_SEQ                             40277      17334      0.08      19
                    VARIANT                             81609      16631      0.15      17
                    ZN_FING                             28857      12523      0.05      25
                    
                    Total number of feature keys: 39
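
For readers who want to process these tables programmatically, a minimal sketch follows. It assumes each subtype row ends with four whitespace-separated columns (total number, number of entries, average per entry, rank), so splitting from the right keeps names that contain spaces intact. The helper `parse_row` is hypothetical; it does not handle the header rows such as "Features (FT)", which lack the entries and rank columns, nor the extra Category column of the cross-reference table.

```python
def parse_row(line):
    """Split one subtype row into its five columns, reading from the right."""
    name, total, entries, average, rank = line.strip().rsplit(None, 4)
    return {
        "name": name.strip(),
        "total": int(total),
        "entries": int(entries),
        "average": average,  # kept as text because of values like "<0.01"
        "rank": int(rank),
    }


# Example row copied from the Features (FT) table above.
row = parse_row("TRANSMEM                           344686      70729      0.65       2")

# Lines per annotated entry (entries that have at least one such line).
print(row["name"], round(row["total"] / row["entries"], 2))
```

Dividing the total by the "Number of entries" column gives the density among annotated entries only, which differs from the "Average per entry" column computed over the whole release.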
                    
                    
                    
                                                        Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank      Category
                    ------------------------------------  -------- ---------  ---------  ----      -------------------------------------------
                    Cross-references (DR)               14311256                27.10                          
                    2DBase-Ecoli                           85         85     <0.01     122      2D gel databases                             
                    Aarhus/Ghent-2DPAGE                   126         96     <0.01     119      2D gel databases                             
                    AGD                                   890        884     <0.01      98      Organism-specific databases
                    Allergome                            1338        809     <0.01      93      Protein family/group databases               
                    ANU-2DPAGE                             23         23     <0.01     128      2D gel databases                             
                    ArachnoServer                         759        755     <0.01     101      Organism-specific databases                  
                    ArrayExpress                        58602      58602      0.11      43      Gene expression databases                    
                    Bgee                                40056      40046      0.08      47      Gene expression databases
                    BindingDB                             297        297     <0.01     115      Other       
                    BioCyc                             252363     243747      0.48      19      Enzyme and pathway databases                 
                    BRENDA                              65284      62481      0.12      41      Enzyme and pathway databases                 
                    CAZy                                 7298       6552      0.01      71      Protein family/group databases
                    CGD                                   601        591     <0.01     104      Organism-specific databases
                    CleanEx                             30156      29507      0.06      49      Gene expression databases                    
                    COMPLUYEAST-2DPAGE                    101        100     <0.01     121      2D gel databases                             
                    ConoServer                            762        736     <0.01      99      Organism-specific databases                  
                    Cornea-2DPAGE                          67         67     <0.01     123      2D gel databases                             
                    CTD                                 65867      65305      0.12      39      Organism-specific databases
                    CYGD                                 6638       6556      0.01      73      Organism-specific databases
                    dictyBase                            4001       4001      0.01      86      Organism-specific databases                  
                    DIP                                 12509      12387      0.02      65      Protein-protein interaction databases
                    DisProt                               397        394     <0.01     110      3D structure databases                       
                    DOSAC-COBS-2DPAGE                     149        147     <0.01     118      2D gel databases                             
                    DrugBank                             5318       1627      0.01      76      Other       
                    EchoBASE                             4167       4163      0.01      85      Organism-specific databases                  
                    ECO2DBASE                             352        300     <0.01     112      2D gel databases                             
                    EcoGene                              4291       4289      0.01      84      Organism-specific databases                  
                    eggNOG                             219189     219189      0.42      20      Phylogenomic databases                       
                    EMBL                               884752     517798      1.68       3      Sequence databases                           
                    Ensembl                             73251      55193      0.14      37      Genome annotation databases                  
                    EnsemblBacteria                     98159      84886      0.19      29      Genome annotation databases                  
                    EnsemblFungi                        15009      14862      0.03      62      Genome annotation databases                  
                    EnsemblMetazoa                      11337       8717      0.02      66      Genome annotation databases                  
                    EnsemblPlants                       15818      13733      0.03      60      Genome annotation databases                  
                    EnsemblProtists                      4433       4316      0.01      83      Genome annotation databases                  
                    euHCVdb                                55         44     <0.01     124      Organism-specific databases
                    EuPathDB                              305        305     <0.01     114      Organism-specific databases                  
                    FlyBase                              5792       5418      0.01      75      Organism-specific databases                  
                    Gene3D                             313396     243054      0.59      16      Family and domain databases                  
                    GeneCards                           20245      19685      0.04      54      Organism-specific databases                  
                    GeneDB_Spombe                        4978       4934      0.01      78      Organism-specific databases                  
                    GeneFarm                             2932       2918      0.01      89      Organism-specific databases                  
                    GeneID                             470620     450844      0.89       6      Genome annotation databases                  
                    GeneTree                           168189     168147      0.32      22      Phylogenomic databases                       
                    Genevestigator                      65783      65783      0.12      40      Gene expression databases                    
                    GenoList                             7053       7041      0.01      72      Organism-specific databases                  
                    GenomeReviews                      376139     356250      0.71      10      Genome annotation databases                  
                    GermOnline                          41915      41298      0.08      46      Gene expression databases                    
                    GlycoSuiteDB                          272        272     <0.01     116      PTM databases
                    GO                                2148971     494697      4.07       1      Ontologies
                    Gramene                              4587       4587      0.01      81      Organism-specific databases                  
                    H-InvDB                             13206      12309      0.03      64      Organism-specific databases                  
                    HAMAP                              309191     309045      0.59      17      Family and domain databases                  
                    HGNC                                19714      19551      0.04      56      Organism-specific databases
                    HOGENOM                            363445     363445      0.69      12      Phylogenomic databases                       
                    HOVERGEN                            74808      74808      0.14      36      Phylogenomic databases                       
                    HPA                                 13305      10020      0.03      63      Organism-specific databases
                    HSSP                                29790      29790      0.06      50      3D structure databases
                    InParanoid                          67534      67534      0.13      38      Phylogenomic databases                       
                    IntAct                              25144      25144      0.05      52      Protein-protein interaction databases        
                    InterPro                          1733672     503080      3.28       2      Family and domain databases                  
                    IPI                                 91225      65257      0.17      31      Sequence databases
                    KEGG                               451437     429807      0.85       8      Genome annotation databases                  
                    LegioList                             761        759     <0.01     100      Organism-specific databases                  
                    Leproma                               671        668     <0.01     103      Organism-specific databases                  
                    MaizeGDB                              477        473     <0.01     108      Organism-specific databases                  
                    MEROPS                              10551      10220      0.02      67      Protein family/group databases               
                    MGI                                 16261      16216      0.03      59      Organism-specific databases
                    MIM                                 16646      12971      0.03      58      Organism-specific databases
                    MINT                                17535      17535      0.03      57      Protein-protein interaction databases
                    NextBio                             48950      48948      0.09      44      Other       
                    neXtProt                            20059      20058      0.04      55      Organism-specific databases                  
                    NMPDR                              132072     132067      0.25      26      Genome annotation databases                  
                    OGP                                   377        377     <0.01     111      2D gel databases
                    OMA                                371522     371522      0.70      11      Phylogenomic databases
                    Orphanet                             3759       2285      0.01      87      Organism-specific databases                  
                    OrthoDB                             76902      76846      0.15      33      Phylogenomic databases                       
                    PANTHER                            157964     150571      0.30      23      Family and domain databases                  
                    Pathway_Interaction_DB               4567       1665      0.01      82      Enzyme and pathway databases                 
                    PDB                                 75135      16775      0.14      35      3D structure databases
                    PDBsum                              75135      16775      0.14      34      3D structure databases                       
                    PeptideAtlas                         5166       5166      0.01      77      Proteomic databases                          
                    PeroxiBase                            739        728     <0.01     102      Protein family/group databases               
                    Pfam                               699859     490081      1.33       4      Family and domain databases                  
                    PharmGKB                            15420      15113      0.03      61      Organism-specific databases                  
                    PHCI-2DPAGE                           247        247     <0.01     117      2D gel databases                             
                    PhosphoSite                         23762      23762      0.04      53      PTM databases
                    PhosSite                              351        351     <0.01     113      PTM databases
                    PhylomeDB                          123071     123071      0.23      27      Phylogenomic databases                       
                    PIR                                116709     106685      0.22      28      Sequence databases
                    PIRSF                               86310      86310      0.16      32      Family and domain databases                  
                    PMAP-CutDB                           1394       1394     <0.01      92      Other       
                    PMMA-2DPAGE                            52         52     <0.01     125      2D gel databases                             
                    PptaseDB                               34         34     <0.01     126      Protein family/group databases               
                    PRIDE                               62042      62042      0.12      42      Proteomic databases                          
                    PRINTS                             137400     119029      0.26      25      Family and domain databases                  
                    ProDom                              27806      27627      0.05      51      Family and domain databases                  
                    ProMEX                                478        478     <0.01     107      Proteomic databases
                    PROSITE                            469090     297801      0.89       7      Family and domain databases                  
                    ProtClustDB                        340197     340197      0.64      14      Phylogenomic databases                       
                    ProteinModelPortal                 417687     417687      0.79       9      3D structure databases                       
                    PseudoCAP                            1224       1215     <0.01      95      Organism-specific databases                  
                    Rat-heart-2DPAGE                       28         28     <0.01     127      2D gel databases                             
                    Reactome                             9367       5611      0.02      69      Enzyme and pathway databases                 
                    REBASE                                441        398     <0.01     109      Protein family/group databases
                    RefSeq                             494847     451740      0.94       5      Sequence databases                           
                    REPRODUCTION-2DPAGE                  1256       1035     <0.01      94      2D gel databases                             
                    RGD                                  7510       7506      0.01      70      Organism-specific databases
                    SGD                                  6638       6573      0.01      74      Organism-specific databases
                    Siena-2DPAGE                          102        102     <0.01     120      2D gel databases                             
                    SMART                              156080     118828      0.30      24      Family and domain databases                  
                    SMR                                349504     349504      0.66      13      3D structure databases
                    STRING                             206360     206325      0.39      21      Protein-protein interaction databases        
                    SUPFAM                             319922     253698      0.61      15      Family and domain databases                  
                    SWISS-2DPAGE                         1184       1183     <0.01      96      2D gel databases                             
                    TAIR                                10392      10306      0.02      68      Organism-specific databases
                    TCDB                                 3538       3525      0.01      88      Protein family/group databases
                    TIGR                                34330      33559      0.07      48      Genome annotation databases
                    TIGRFAMs                           285619     265456      0.54      18      Family and domain databases                  
                    TubercuList                          1889       1853     <0.01      91      Organism-specific databases                  
                    UCD-2DPAGE                            511        502     <0.01     106      2D gel databases                             
                    UCSC                                48701      39693      0.09      45      Genome annotation databases
                    UniGene                             93137      85582      0.18      30      Sequence databases                           
                    VectorBase                            536        523     <0.01     105      Genome annotation databases                  
                    World-2DPAGE                          917        906     <0.01      97      2D gel databases                             
                    WormBase                             4641       3818      0.01      79      Organism-specific databases                  
                    Xenbase                              4619       4598      0.01      80      Organism-specific databases                  
                    ZFIN                                 2657       2645      0.01      90      Organism-specific databases
                    
                    Total number of cross-referenced databases: 128
                    
                    6.  AMINO ACID COMPOSITION
                    
                    6.1  Composition in percent for the complete database
                    
                    Ala (A) 8.26   Gln (Q) 3.93   Leu (L) 9.66   Ser (S) 6.53
                    Arg (R) 5.53   Glu (E) 6.75   Lys (K) 5.85   Thr (T) 5.33
                    Asn (N) 4.06   Gly (G) 7.08   Met (M) 2.42   Trp (W) 1.08
                    Asp (D) 5.45   His (H) 2.27   Phe (F) 3.86   Tyr (Y) 2.92
                    Cys (C) 1.36   Ile (I) 5.97   Pro (P) 4.69   Val (V) 6.87
                    
                    Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.00
                    
                    
                    
                    Legend: gray = aliphatic, red = acidic, green = small hydroxy,
                    blue = basic, black = aromatic, white = amide, yellow = sulfur
                    
                    
                    6.2  Classification of the amino acids by their frequency
                    
                    Leu, Ala, Gly, Val, Glu, Ser, Ile, Lys, Arg, Asp, Thr, Pro, Asn, Gln,
                    Phe, Tyr, Met, His, Cys, Trp
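
The ordering in 6.2 can be reproduced from the percentages in 6.1 by sorting in descending order. The sketch below does this with the values copied from the table; the variable names are illustrative only.

```python
# Composition in percent, copied from section 6.1.
composition = {
    "Ala": 8.26, "Gln": 3.93, "Leu": 9.66, "Ser": 6.53,
    "Arg": 5.53, "Glu": 6.75, "Lys": 5.85, "Thr": 5.33,
    "Asn": 4.06, "Gly": 7.08, "Met": 2.42, "Trp": 1.08,
    "Asp": 5.45, "His": 2.27, "Phe": 3.86, "Tyr": 2.92,
    "Cys": 1.36, "Ile": 5.97, "Pro": 4.69, "Val": 6.87,
}

# Rank residues from most to least frequent.
by_frequency = sorted(composition, key=composition.get, reverse=True)
print(", ".join(by_frequency))
```

No two residues share the same percentage in this release, so the sort order is unambiguous.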
                    
                    
                    7.  MISCELLANEOUS STATISTICS
                    
                    4445 entries are encoded on a mitochondrion, and 3609 are encoded on a plasmid.
                    
                    12183 entries are encoded on a plastid, 
                    of which 21 are encoded on apicoplasts, 
                    11619 on chloroplasts, 
                    50 on organellar chromatophores,
                    145 on cyanelles, 
                    149 on non-photosynthetic plastids and 
                    199 on unspecified types of plastid.
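
As a quick consistency check, the plastid subtypes listed above should add up to the stated plastid total. A minimal sketch, with the counts copied from the text:

```python
# Plastid-encoded entries by plastid type, copied from the text above.
plastid_counts = {
    "apicoplast": 21,
    "chloroplast": 11619,
    "organellar chromatophore": 50,
    "cyanelle": 145,
    "non-photosynthetic plastid": 149,
    "unspecified plastid": 199,
}

PLASTID_TOTAL = 12183  # stated total of plastid-encoded entries

subtotal = sum(plastid_counts.values())
print(subtotal, subtotal == PLASTID_TOTAL)
```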
                    
                    Number of entries with at least one sequence correction: 70847