UniProtKB/Swiss-Prot protein knowledgebase release 2011_01 statistics
                    
                    
                    1.  INTRODUCTION
                    
                    Release 2011_01 of 11-Jan-11 of UniProtKB/Swiss-Prot contains 524420 sequence entries,
                    comprising 185205850 amino acids abstracted from 194602 references. 
                    
                    1319 sequences have been added since release 2010_12, the sequence data of
                    338 existing entries has been updated and the annotations of
                    155669 entries have been revised.
                    
                    Number of fragments: 8815
                    Number of additional sequences produced by alternative splicing, initiation or promoter usage, or ribosomal frameshifting: 29850
                    
                    
                    Protein existence (PE):           entries     %
                    
                    1: Evidence at protein level        71885   13.7%
                    2: Evidence at transcript level     68424   13.0%
                    3: Inferred from homology          368122   70.2%
                    4: Predicted                        14323    2.7%
                    5: Uncertain                         1666    0.3%
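Each percentage in the table is the category count divided by the total number of entries. A minimal Python sketch that reproduces the column, with counts copied from the table; the names `pe_counts` and `pe_percentages` are illustrative and not part of any UniProt tool.

```python
# Counts copied from the protein existence (PE) table above.
pe_counts = {
    "1: Evidence at protein level": 71885,
    "2: Evidence at transcript level": 68424,
    "3: Inferred from homology": 368122,
    "4: Predicted": 14323,
    "5: Uncertain": 1666,
}

# The five categories should add up to the release total (524420 entries).
total_entries = sum(pe_counts.values())


def pe_percentages(counts):
    """Return each category's share of all entries, in percent, to one decimal."""
    total = sum(counts.values())
    return {level: round(100 * n / total, 1) for level, n in counts.items()}


for level, pct in pe_percentages(pe_counts).items():
    print(f"{level:<34}{pe_counts[level]:>7}{pct:>7.1f}%")
```

Because each value is rounded separately, the printed percentages need not add up to exactly 100.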
                    
                    The growth of the database is summarized below.
                    
                    
                    
                    
                    2.  TAXONOMIC ORIGIN
                    
                    Total number of species represented in this release of UniProtKB/Swiss-Prot: 12360
                    
                    The twenty most represented species account for 109415 sequences, or 20.9%
                    of the total number of entries.
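The 20.9% figure can be reproduced from the first twenty rows of table 2.2. The sketch below is only a cross-check; the counts are copied from that table, the total from section 1, and the variable names are illustrative.

```python
# Entry counts for ranks 1-20 of table 2.2 (Homo sapiens ... Salmonella typhimurium).
top20_counts = [
    20252, 16332, 10014, 7575, 6583, 5811, 4976, 4430, 4244, 4231,
    3319, 3283, 3104, 2702, 2610, 2212, 2183, 1996, 1782, 1776,
]

total_entries = 524420  # total sequence entries, from section 1

top20_total = sum(top20_counts)
top20_share = round(100 * top20_total / total_entries, 1)

print(f"{top20_total} sequences, {top20_share}% of all entries")
```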
                    
                    
                    2.1 Table of the frequency of occurrence of species
                    
                    Species represented       1x: 5292
                                              2x: 1761
                                              3x:  931
                                              4x:  604
                                              5x:  445
                                              6x:  354
                                              7x:  257
                                              8x:  209
                                              9x:  195
                                             10x:  106
                                         11- 20x:  615
                                         21- 50x:  384
                                         51-100x:  183
                                           >100x: 1024
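Each row of this table counts the species that have the given number of entries, so the rows should add up to the 12360 species reported at the start of section 2. A short Python sketch of that consistency check follows; the dictionary keys are informal labels chosen for this example.

```python
# Number of species per occurrence class, copied from table 2.1.
species_frequency = {
    "1x": 5292, "2x": 1761, "3x": 931, "4x": 604, "5x": 445,
    "6x": 354, "7x": 257, "8x": 209, "9x": 195, "10x": 106,
    "11-20x": 615, "21-50x": 384, "51-100x": 183, ">100x": 1024,
}

species_total = sum(species_frequency.values())

# Share of species that are represented by a single entry, in percent.
single_entry_share = 100 * species_frequency["1x"] / species_total

print(species_total)
print(f"{single_entry_share:.1f}% of species have exactly one entry")
```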
                    
                    
                    2.2  Table of the most represented species
                    
                    ------  ---------  --------------------------------------------
                    Number  Frequency  Species
                    ------  ---------  --------------------------------------------
                    1      20252  Homo sapiens (Human)
                    2      16332  Mus musculus (Mouse)
                    3      10014  Arabidopsis thaliana (Mouse-ear cress)
                    4       7575  Rattus norvegicus (Rat)
                    5       6583  Saccharomyces cerevisiae (Baker's yeast)
                    6       5811  Bos taurus (Bovine)
                    7       4976  Schizosaccharomyces pombe (Fission yeast)
                    8       4430  Escherichia coli (strain K12)
                    9       4244  Bacillus subtilis
                    10       4231  Dictyostelium discoideum (Slime mold)
                    11       3319  Caenorhabditis elegans
                    12       3283  Xenopus laevis (African clawed frog)
                    13       3104  Drosophila melanogaster (Fruit fly)
                    14       2702  Danio rerio (Zebrafish) (Brachydanio rerio)
                    15       2610  Oryza sativa subsp. japonica (Rice)
                    16       2212  Pongo abelii (Sumatran orangutan)
                    17       2183  Gallus gallus (Chicken)
                    18       1996  Escherichia coli O157:H7
                    19       1782  Methanocaldococcus jannaschii (Methanococcus jannaschii)
                    20       1776  Salmonella typhimurium
                    21       1771  Haemophilus influenzae
                    22       1733  Mycobacterium tuberculosis
                    23       1674  Shigella flexneri
                    24       1671  Escherichia coli O6
                    25       1578  Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
                    26       1379  Sus scrofa (Pig)
                    27       1342  Salmonella typhi
                    28       1283  Pseudomonas aeruginosa
                    29       1224  Mycobacterium bovis
                    30       1164  Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey)
                    31       1023  Synechocystis sp. (strain ATCC 27184 / PCC 6803 / N-1)
                    32        998  Yersinia pestis
                    33        993  Archaeoglobus fulgidus
                    34        944  Vibrio cholerae
                    35        929  Salmonella paratyphi A
                    36        924  Staphylococcus aureus (strain N315)
                    37        923  Staphylococcus aureus (strain Mu50 / ATCC 700699)
                    38        913  Rhizobium meliloti (Sinorhizobium meliloti)
                    39        909  Acanthamoeba polyphaga mimivirus (APMV)
                    40        897  Staphylococcus aureus (strain COL)
                    41        895  Staphylococcus aureus (strain MW2)
                    42        889  Staphylococcus aureus (strain MSSA476)
                    43        886  Staphylococcus aureus (strain MRSA252)
                    44        885  Escherichia coli O6:K15:H31 (strain 536 / UPEC)
                    45        883  Oryctolagus cuniculus (Rabbit)
                    46        881  Salmonella choleraesuis
                    47        874  Shigella sonnei (strain Ss046)
                    48        864  Yersinia pseudotuberculosis
                    49        853  Ashbya gossypii (Yeast) (Eremothecium gossypii)
                    50        841  Escherichia coli O9:H4 (strain HS)
                    51        834  Escherichia coli O139:H28 (strain E24377A / ETEC)
                    52        828  Kluyveromyces lactis (Yeast) (Candida sphaerica)
                    53        825  Shigella boydii serotype 4 (strain Sb227)
                    54        823  Escherichia coli (strain UTI89 / UPEC)
                    55        822  Candida albicans (Yeast)
                    56        819  Escherichia coli (strain ATCC 8739 / DSM 1576 / Crooks)
                    57        808  Shigella dysenteriae serotype 1 (strain Sd197)
                    58        794  Vibrio parahaemolyticus
                    59        790  Escherichia coli (strain SMS-3-5 / SECEC)
                    60        788  Neurospora crassa
                    61        787  Candida glabrata (Yeast) (Torulopsis glabrata)
                    62        779  Erwinia carotovora subsp. atroseptica (Pectobacterium atrosepticum)
                    63        777  Pasteurella multocida
                    64        773  Aquifex aeolicus
                    65        770  Escherichia coli (strain K12 / DH10B)
                    66        765  Canis familiaris (Dog) (Canis lupus familiaris)
                    67        764  Escherichia coli O127:H6 (strain E2348/69 / EPEC)
                    68        764  Escherichia coli (strain K12 / MC4100 / BW2952)
                    69        762  Escherichia coli O17:K52:H18 (strain UMN026 / ExPEC)
                    70        762  Escherichia coli (strain 55989 / EAEC)
                    71        761  Escherichia coli O8 (strain IAI1)
                    72        757  Staphylococcus epidermidis (strain ATCC 35984 / RP62A)
                    73        756  Escherichia coli (strain SE11)
                    74        756  Shigella flexneri serotype 5b (strain 8401)
                    75        756  Staphylococcus epidermidis (strain ATCC 12228)
                    76        756  Escherichia coli O45:K1 (strain S88 / ExPEC)
                    77        753  Escherichia coli O7:K1 (strain IAI39 / ExPEC)
                    78        747  Streptomyces coelicolor
                    79        746  Escherichia coli O157:H7 (strain EC4115 / EHEC)
                    80        738  Photorhabdus luminescens subsp. laumondii
                    81        731  Escherichia coli O81 (strain ED1a)
                    82        731  Bacillus halodurans
                    83        731  Vibrio vulnificus
                    84        731  Yersinia enterocolitica serotype O:8 / biotype 1B (strain 8081)
                    85        725  Emericella nidulans (Aspergillus nidulans)
                    86        724  Bacillus anthracis
                    87        720  Salmonella enteritidis PT4 (strain P125109)
                    88        715  Vibrio vulnificus (strain YJ016)
                    89        715  Salmonella paratyphi B (strain ATCC BAA-1250 / SPB7)
                    90        713  Yersinia pestis bv. Antiqua (strain Nepal516)
                    91        713  Staphylococcus aureus (strain NCTC 8325)
                    92        713  Salmonella paratyphi A (strain AKU_12601)
                    93        712  Yersinia pseudotuberculosis serotype O:1b (strain IP 31758)
                    94        711  Salmonella agona (strain SL483)
                    95        711  Escherichia coli O1:K1 / APEC
                    96        711  Salmonella newport (strain SL254)
                    97        710  Salmonella heidelberg (strain SL476)
                    98        709  Yersinia pestis bv. Antiqua (strain Antiqua)
                    99        709  Salmonella schwarzengrund (strain CVM19633)
                    100        704  Enterobacter sp. (strain 638)
                    101        703  Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)
                    102        700  Salmonella dublin (strain CT_02021853)
                    103        697  Shigella boydii serotype 18 (strain CDC 3083-94 / BS512)
                    104        691  Klebsiella pneumoniae (strain 342)
                    105        687  Mycoplasma pneumoniae
                    106        686  Pan troglodytes (Chimpanzee)
                    107        686  Escherichia fergusonii (strain ATCC 35469 / DSM 13698 / CDC 0568-73)
                    108        685  Nostoc sp. (strain PCC 7120 / UTEX 2576)
                    109        684  Pseudomonas syringae pv. tomato
                    110        682  Salmonella gallinarum (strain 287/91 / NCTC 13346)
                    111        671  Pseudomonas putida (strain KT2440)
                    112        669  Citrobacter koseri (strain ATCC BAA-895 / CDC 4225-83 / SGSC4696)
                    113        668  Mycobacterium leprae
                    114        666  Staphylococcus aureus (strain USA300)
                    115        666  Zea mays (Maize)
                    116        666  Yersinia pestis (strain Pestoides F)
                    117        660  Serratia proteamaculans (strain 568)
                    118        658  Rhizobium sp. (strain NGR234)
                    119        650  Bradyrhizobium japonicum
                    120        644  Escherichia coli
                    121        642  Staphylococcus aureus (strain bovine RF122 / ET3-1)
                    122        638  Bacillus cereus (strain ATCC 14579 / DSM 31)
                    123        637  Yersinia pseudotuberculosis serotype O:3 (strain YPIII)
                    124        635  Salmonella arizonae (strain ATCC BAA-731 / CDC346-86 / RSK2980)
                    125        633  Yersinia pseudotuberculosis serotype IB (strain PB1/+)
                    126        625  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
                    127        621  Shewanella oneidensis
                    128        618  Debaryomyces hansenii (Yeast) (Torulaspora hansenii)
                    129        615  Treponema pallidum
                    130        614  Ralstonia solanacearum (Pseudomonas solanacearum)
                    131        613  Enterobacter sakazakii (strain ATCC BAA-894)
                    132        610  Staphylococcus haemolyticus (strain JCSC1435)
                    133        608  Yarrowia lipolytica (Candida lipolytica)
                    134        605  Rhizobium loti (Mesorhizobium loti)
                    135        603  Methanobacterium thermoautotrophicum
                    136        602  Staphylococcus saprophyticus subsp. saprophyticus 
                    137        600  Salmonella paratyphi C (strain RKS4594)
                    138        598  Yersinia pestis bv. Antiqua (strain Angola)
                    139        597  Listeria monocytogenes
                    140        596  Photobacterium profundum (Photobacterium sp. (strain SS9))
                    141        590  Bacillus cereus (strain ATCC 10987)
                    142        590  Xanthomonas campestris pv. campestris
                    143        588  Listeria innocua
                    144        587  Aspergillus fumigatus (Sartorya fumigata)
                    145        585  Rickettsia prowazekii
                    146        584  Helicobacter pylori (Campylobacter pylori)
                    147        584  Pectobacterium carotovorum subsp. carotovorum (strain PC1)
                    148        581  Lactococcus lactis subsp. lactis (Streptococcus lactis)
                    149        579  Neisseria meningitidis serogroup B
                    150        576  Brucella suis
                    151        572  Brucella melitensis
                    152        572  Buchnera aphidicola subsp. Acyrthosiphon pisum 
                    153        569  Bacillus thuringiensis subsp. konkukian
                    154        565  Helicobacter pylori J99 (Campylobacter pylori J99)
                    155        565  Pseudomonas syringae pv. syringae (strain B728a)
                    156        562  Buchnera aphidicola subsp. Schizaphis graminum
                    157        560  Bacillus cereus (strain ZK / E33L)
                    158        558  Bacillus licheniformis (strain DSM 13 / ATCC 14580)
                    159        557  Pseudomonas aeruginosa (strain UCBPP-PA14)
                    160        556  Neisseria meningitidis serogroup A
                    161        555  Xanthomonas axonopodis pv. citri (Citrus canker)
                    162        553  Vibrio fischeri (strain ATCC 700601 / ES114)
                    163        551  Pseudomonas fluorescens (strain Pf0-1)
                    164        549  Oceanobacillus iheyensis
                    165        548  Clostridium acetobutylicum
                    166        547  Caulobacter crescentus (Caulobacter vibrioides)
                    167        545  Pseudomonas fluorescens (strain Pf-5 / ATCC BAA-477)
                    168        544  Caenorhabditis briggsae
                    169        543  Pseudomonas syringae pv. phaseolicola (strain 1448A / Race 6)
                    170        541  Oryza sativa subsp. indica (Rice)
                    171        529  Listeria monocytogenes serotype 4b (strain F2365)
                    172        524  Erwinia tasmaniensis (strain DSM 17950 / Et1/99)
                    173        522  Sodalis glossinidius (strain morsitans)
                    174        522  Xylella fastidiosa
                    175        521  Bordetella bronchiseptica (Alcaligenes bronchisepticus)
                    176        519  Streptococcus pneumoniae
                    177        513  Chromobacterium violaceum
                    178        512  Xylella fastidiosa (strain Temecula1 / ATCC 700964)
                    179        511  Thermotoga maritima
                    180        509  Vibrio cholerae serotype O1 (strain ATCC 39541 / Ogawa 395 / O395)
                    181        507  Bordetella parapertussis
                    182        507  Buchnera aphidicola subsp. Baizongia pistaciae (strain Bp)
                    183        507  Pseudomonas aeruginosa (strain PA7)
                    184        506  Bordetella pertussis
                    185        505  Staphylococcus aureus (strain Newman)
                    186        505  Haemophilus ducreyi
                    187        504  Geobacillus kaustophilus
                    188        500  Pseudomonas entomophila (strain L48)
                    189        498  Deinococcus radiodurans
                    190        498  Brucella abortus
                    191        497  Rickettsia conorii
                    192        496  Bacillus clausii (strain KSM-K16)
                    193        492  Haemophilus influenzae (strain 86-028NP)
                    194        492  Streptomyces avermitilis
                    195        491  Corynebacterium glutamicum (Brevibacterium flavum)
                    196        490  Xanthomonas campestris pv. campestris (strain 8004)
                    197        490  Vibrio harveyi (strain ATCC BAA-1116 / BB120)
                    198        490  Clostridium perfringens
                    199        488  Bacillus amyloliquefaciens (strain FZB42)
                    200        487  Burkholderia pseudomallei (Pseudomonas pseudomallei)
                    201        487  Shewanella sp. (strain MR-7)
                    202        484  Pseudomonas aeruginosa (strain LESB58)
                    203        484  Staphylococcus aureus (strain Mu3 / ATCC 700698)
                    204        484  Shewanella sp. (strain MR-4)
                    205        483  Mannheimia succiniciproducens (strain MBEL55E)
                    206        483  Mycoplasma genitalium
                    207        482  Methanosarcina acetivorans
                    208        481  Proteus mirabilis (strain HI4320)
                    209        475  Synechococcus elongatus (strain PCC 7942) (Anacystis nidulans R2)
                    210        474  Acinetobacter sp. (strain ADP1)
                    211        474  Thermosynechococcus elongatus (strain BP-1)
                    212        473  Pseudomonas putida (strain F1 / ATCC 700007)
                    213        472  Burkholderia sp. (strain 383) (Burkholderia cepacia 
                    214        472  Brucella abortus (strain 2308)
                    215        469  Enterococcus faecalis (Streptococcus faecalis)
                    216        466  Pyrococcus horikoshii
                    217        465  Rhodopseudomonas palustris
                    218        465  Xanthomonas campestris pv. vesicatoria (strain 85-10)
                    219        465  Pseudomonas putida (strain GB-1)
                    220        464  Shewanella frigidimarina (strain NCIMB 400)
                    221        462  Anabaena variabilis (strain ATCC 29413 / PCC 7937)
                    222        462  Shewanella sp. (strain ANA-3)
                    223        461  Burkholderia mallei (Pseudomonas mallei)
                    224        461  Lactobacillus plantarum
                    225        461  Methanosarcina mazei (Methanosarcina frisia)
                    226        460  Ralstonia eutropha  (Cupriavidus necator 
                    227        459  Pyrococcus abyssi
                    228        458  Aeromonas hydrophila subsp. hydrophila (strain ATCC 7966 / NCIB 9240)
                    229        457  Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
                    230        457  Ralstonia eutropha (strain JMP134) (Alcaligenes eutrophus)
                    231        455  Staphylococcus aureus (strain JH1)
                    232        454  Halobacterium salinarium (Halobacterium halobium)
                    233        453  Rickettsia felis (Rickettsia azadi)
                    234        453  Xanthomonas oryzae pv. oryzae (strain MAFF 311018)
                    235        452  Shewanella baltica (strain OS185)
                    236        452  Pseudomonas putida (strain W619)
                    237        450  Ovis aries (Sheep)
                    238        449  Staphylococcus aureus (strain JH9)
                    239        449  Methylococcus capsulatus
                    240        449  Streptococcus mutans
                    241        448  Thermoanaerobacter tengcongensis
                    242        448  Aeromonas salmonicida (strain A449)
                    243        446  Mycobacterium paratuberculosis
                    244        446  Vibrio fischeri (strain MJ11)
                    245        446  Rhodobacter sphaeroides (strain ATCC 17023 / 2.4.1 / NCIB 8253 / DSM 158)
                    246        444  Hahella chejuensis (strain KCTC 2396)
                    247        444  Pseudomonas mendocina (strain ymp)
                    248        443  Dechloromonas aromatica (strain RCB)
                    249        441  Streptococcus pyogenes serotype M6
                    250        441  Nicotiana tabacum (Common tobacco)
                    
                    
                    
                    2.3  Taxonomic distribution of the sequences
                    
                    
                    
                    Kingdom        sequences (% of the database)
                    Archaea           18393 (  4%)
                    Bacteria         325204 ( 62%)
                    Eukaryota        165635 ( 32%)
                    Viruses           15188 (  3%)
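The kingdom percentages are rounded to whole numbers, which is why they add up to 101 rather than 100. A small Python sketch, using the counts from the table and illustrative variable names, shows the effect:

```python
# Sequence counts per kingdom, copied from table 2.3.
kingdom_counts = {
    "Archaea": 18393,
    "Bacteria": 325204,
    "Eukaryota": 165635,
    "Viruses": 15188,
}

total_entries = sum(kingdom_counts.values())

# Round each share to a whole percent, as the table does.
kingdom_percent = {
    kingdom: round(100 * n / total_entries)
    for kingdom, n in kingdom_counts.items()
}
percent_sum = sum(kingdom_percent.values())

print(kingdom_percent)
print(percent_sum)
```

The counts themselves add up to the release total; only the rounded percentages overshoot.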
                    
                    
                    Within Eukaryota:
                    
                    
                    
                    Category            sequences (% of Eukaryota) (% of the complete database)
                    Human                  20253 ( 12%)           (  4%)
                    Other Mammalia         44973 ( 27%)           (  9%)
                    Other Vertebrata       16426 ( 10%)           (  3%)
                    Viridiplantae          30501 ( 18%)           (  6%)
                    Fungi                  27318 ( 16%)           (  5%)
                    Insecta                 8190 (  5%)           (  2%)
                    Nematoda                4165 (  3%)           (  1%)
                    Other                  13809 (  8%)           (  3%)
                    
                    
                    
                    3.  SEQUENCE SIZE
                    
                    Distribution of the sequences by size (excluding fragments)
                    
                     From   To  Number             From   To   Number
                       1-  50    8580             1001-1100     3583
                      51- 100   40417             1101-1200     2478
                     101- 150   56407             1201-1300     1947
                     151- 200   56450             1301-1400     1803
                     201- 250   55264             1401-1500     1437
                     251- 300   48486             1501-1600      640
                     301- 350   48945             1601-1700      518
                     351- 400   42132             1701-1800      431
                     401- 450   34472             1801-1900      399
                     451- 500   27640             1901-2000      327
                     501- 550   19558             2001-2100      202
                     551- 600   13967             2101-2200      269
                     601- 650   11737             2201-2300      281
                     651- 700    8453             2301-2400      168
                     701- 750    6961             2401-2500      130
                     751- 800    4993             >2500         1035
                     801- 850    4351
                     851- 900    4865
                     901- 950    3694
                     951-1000    2585
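Since the table excludes fragments, its bins should add up to the total number of entries minus the number of fragments, both given in section 1. The following Python sketch performs that check; the list names are illustrative, and the values are copied from the table.

```python
# Counts per length bin, copied from the size table (fragments excluded).
# Bins 1-50 through 951-1000, in steps of 50 residues.
bins_up_to_1000 = [
    8580, 40417, 56407, 56450, 55264, 48486, 48945, 42132, 34472, 27640,
    19558, 13967, 11737, 8453, 6961, 4993, 4351, 4865, 3694, 2585,
]
# Bins 1001-1100 through 2401-2500, in steps of 100 residues, then >2500.
bins_above_1000 = [
    3583, 2478, 1947, 1803, 1437, 640, 518, 431, 399, 327,
    202, 269, 281, 168, 130, 1035,
]

total_entries = 524420  # from section 1
fragments = 8815        # from section 1

binned_total = sum(bins_up_to_1000) + sum(bins_above_1000)

# Share of non-fragment sequences that are at most 500 residues long, in percent.
share_up_to_500 = 100 * sum(bins_up_to_1000[:10]) / binned_total

print(binned_total, total_entries - fragments)
print(f"{share_up_to_500:.1f}% of non-fragment sequences have 500 residues or fewer")
```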
                    
                    
                    
                    
                    The average sequence length in UniProtKB/Swiss-Prot is 353 amino acids.
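The average can be cross-checked against the totals in section 1. The sketch assumes the figure is the total amino-acid count divided by the number of entries, fragments included; that reading is an assumption and is not stated in this report.

```python
total_amino_acids = 185205850  # from section 1
total_entries = 524420         # from section 1

# Mean length over all entries (assumed to include fragments).
average_length = total_amino_acids / total_entries

print(f"{average_length:.2f} amino acids per entry")
print(round(average_length))
```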
                    
                    The shortest sequence is   GWA_SEPOF (P83570):     2 amino acids.
                    The longest sequence is  TITIN_MOUSE (A2ASS6): 35213 amino acids.
                    
                    
                    4.  JOURNAL CITATIONS
                    
                    Note: the following citation statistics reflect the number of distinct
                    journal citations.
                    
                    Total number of journals cited in this release of UniProtKB/Swiss-Prot: 2109
                    
                    
                    4.1 Table of the frequency of journal citations
                    
                    Journals cited       1x:  684
                                         2x:  277
                                         3x:  153
                                         4x:   97
                                         5x:   94
                                         6x:   67
                                         7x:   37
                                         8x:   38
                                         9x:   31
                                        10x:   25
                                    11- 20x:  173
                                    21- 50x:  171
                                    51-100x:  100
                                      >100x:  162
                    
                    
                    4.2  List of the most cited journals in UniProtKB/Swiss-Prot
                    
                    Nb    Citations   Journal name
                    --    ---------   -------------------------------------------------------------
                    1        18420   Journal of Biological Chemistry
                    2         8513   Proceedings of the National Academy of Sciences of the U.S.A.
                    3         5110   Journal of Bacteriology
                    4         4606   Biochemical and Biophysical Research Communications
                    5         4526   Gene
                    6         4331   Nucleic Acids Research
                    7         4037   FEBS Letters
                    8         3997   Biochemistry
                    9         3819   The EMBO Journal
                    10         3493   Molecular and Cellular Biology
                    11         3321   Nature
                    12         3148   Journal of Molecular Biology
                    13         3126   European Journal of Biochemistry
                    14         3009   Biochimica et Biophysica Acta
                    15         2752   Cell
                    16         2482   Genomics
                    17         2212   Biochemical Journal
                    18         2182   Science
                    19         2082   Journal of Virology
                    20         1803   Molecular Microbiology
                    21         1615   Journal of Cell Biology
                    22         1522   Plant Molecular Biology
                    23         1423   Plant Physiology
                    24         1408   Genes and Development
                    25         1381   Virology
                    26         1349   Human Molecular Genetics
                    27         1345   Nature Genetics
                    28         1322   The American Journal of Human Genetics
                    29         1307   Molecular and General Genetics
                    30         1219   Oncogene
                    31         1200   Development
                    32         1174   Journal of Biochemistry
                    33         1120   Human Mutation
                    34         1059   Molecular Biology of the Cell
                    35         1031   Journal of Immunology
                    36         1008   Genetics
                    37          972   The Plant Cell
                    38          917   Structure
                    39          913   Journal of General Virology
                    40          893   Infection and Immunity
                    41          884   Molecular Cell
                    42          840   Archives of Biochemistry and Biophysics
                    43          830   The Plant Journal
                    44          812   Blood
                    45          770   Microbiology
                    46          764   Journal of Cell Science
                    47          761   Yeast
                    48          740   Developmental Biology
                    49          681   Cancer Research
                    50          670   Current Biology
                    51          669   FEMS Microbiology Letters
                    52          601   Nature Structural Biology
                    53          598   Mechanisms of Development
                    54          596   Human Genetics
                    55          576   Acta Crystallographica, Section D
                    56          565   Protein Science
                    57          555   Applied and Environmental Microbiology
                    58          545   Journal of Neuroscience
                    59          530   Toxicon
                    60          526   Current Genetics
                    61          520   Neuron
                    62          516   Journal of Clinical Investigation
                    63          473   Mammalian Genome
                    64          472   American Journal of Physiology
                    65          454   The Journal of Experimental Medicine
                    66          451   Immunogenetics
                    67          446   Molecular Endocrinology
                    68          424   Molecular and Biochemical Parasitology
                    69          414   Journal of Neurochemistry
                    70          412   The Journal of Clinical Endocrinology and Metabolism
                    71          399   Endocrinology
                    72          386   Proteins
                    73          382   Journal of Molecular Evolution
                    74          374   Bioscience, Biotechnology, and Biochemistry
                    75          367   DNA and Cell Biology
                    76          361   Molecular Biology and Evolution
                    77          360   Journal of Medical Genetics
                    78          358   DNA Sequence
                    79          340   Plant and Cell Physiology
                    80          326   Nature Cell Biology
                    81          321   Tissue Antigens
                    82          315   Brain Research. Molecular Brain Research
                    83          312   Peptides
                    84          308   Experimental Cell Research
                    85          300   Comparative Biochemistry and Physiology
                    86          290   Biological Chemistry Hoppe-Seyler
                    87          289   Antimicrobial Agents and Chemotherapy
                    88          283   Journal of Investigative Dermatology
                    89          276   Cytogenetics and Cell Genetics
                    90          271   Molecular Pharmacology
                    91          265   Biology of Reproduction
                    92          263   Developmental Cell
                    93          252   Genome Research
                    94          248   Journal of General Microbiology
                    95          248   Neurology
                    96          246   RNA
                    97          246   Virus Research
                    98          244   Developmental Dynamics
                    99          230   Molecular Plant-Microbe Interactions
                    100          226   Planta
                    101          215   Hoppe-Seyler's Zeitschrift fur Physiologische Chemie
                    102          207   European Journal of Immunology
                    103          206   Biochimie
                    104          206   DNA Research
                    105          206   Genes to Cells
                    106          204   Annals of Neurology
                    107          204   Nature Structural and Molecular Biology
                    108          201   Immunity
                    109          200   Eukaryotic Cell
                    110          197   The FEBS Journal
                    111          196   The New England Journal of Medicine
                    112          195   European Journal of Human Genetics
                    113          188   Journal of Human Genetics
                    114          180   EMBO Reports
                    115          176   Molecular and Cellular Endocrinology
                    116          176   Investigative Ophthalmology and Visual Science
                    117          172   The FASEB Journal
                    118          168   Archives of Virology
                    119          167   Archives of Microbiology
                    120          165   American Journal of Medical Genetics
                    121          164   Molecular Phylogenetics and Evolution
                    122          162   Insect Biochemistry and Molecular Biology
                    123          160   Molecular Immunology
                    124          159   DNA
                    125          155   Molecular Reproduction and Development
                    126          153   Diabetes
                    127          153   Hemoglobin
                    128          152   Bioorganicheskaia Khimiia
                    129          151   Glycobiology
                    130          151   PLoS ONE
                    131          149   Clinical Genetics
                    132          147   Journal of the American Chemical Society
                    133          144   International Journal of Cancer
                    134          143   Journal of Cellular Biochemistry
                    135          141   Molecular Genetics and Metabolism
                    136          140   BMC Genomics
                    137          140   Molecular and Cellular Neuroscience
                    138          137   General and Comparative Endocrinology
                    139          137   Animal Genetics
                    140          136   Nature Immunology
                    141          133   Biological Chemistry
                    142          132   American Journal of Medical Genetics. Part A
                    143          130   British Journal of Haematology
                    144          129   Molecular Genetics and Genomics
                    145          126   Proteomics
                    146          124   Journal of Lipid Research
                    147          122   Journal of Medicinal Chemistry
                    148          122   Circulation Research
                    149          122   Agricultural and Biological Chemistry
                    150          120   Protein Expression and Purification
                    
                    
                    5.  STATISTICS FOR SOME LINE TYPES
                    
                    The following tables summarize, for selected UniProtKB/Swiss-Prot line types,
                    the total number of lines, the number of entries with at least one such line,
                    and the average number of such lines per entry. Subtypes are ranked by total
                    number of lines.
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank
                    ------------------------------------  -------- ---------  ---------  ----
                    
                    References (RL)                       951683                 1.81                                         
                    Journal                            753249     396101      1.44       1                                 
                    Submitted to EMBL/GenBank/DDBJ     185619     171422      0.35       2                                 
                    Submitted to other databases        10730       9286      0.02       3                                 
                    Book citation                         646        632     <0.01       4                                 
                    Plant Gene Register                   566        554     <0.01       5                                 
                    Thesis                                402        399     <0.01       6                                 
                    Unpublished observations              293        289     <0.01       7                                 
                    Patent                                172        170     <0.01       8                                 
                    Worm Breeder's Gazette                  6          6     <0.01       9                                 
                    
                    Total number of distinct authors cited in UniProtKB/Swiss-Prot: 297602
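
The "Average per entry" column in the references table is consistent with dividing the total number of lines by the total number of entries in the release (524420, as stated in the introduction), not by the number of entries that carry the line type. The following Python sketch checks that reading against a few rows. The function name `average_per_entry` is illustrative, and the rule that values rounding to 0.00 are shown as "<0.01" is an assumption inferred from the table rows, not a documented convention.

```python
# Total entries in release 2011_01, taken from the introduction of this document.
TOTAL_ENTRIES = 524420

def average_per_entry(total_lines, total_entries=TOTAL_ENTRIES):
    """Format the average as the table appears to: two decimals, or '<0.01'
    when the value would round to 0.00 (assumed convention)."""
    avg = total_lines / total_entries
    return "<0.01" if avg < 0.005 else f"{avg:.2f}"

# (line type / subtype, total number of lines) copied from the references table.
rows = [
    ("References (RL)", 951683),
    ("Journal", 753249),
    ("Submitted to EMBL/GenBank/DDBJ", 185619),
    ("Book citation", 646),
]

for name, total in rows:
    print(f"{name:<32} {average_per_entry(total):>6}")
```

The same function can be applied to the rows of the comments, features and cross-references tables in this section.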
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank
                    ------------------------------------  -------- ---------  ---------  ----
                    Comments (CC)                        2276051                 4.34                                         
                    ALLERGEN                              496        496     <0.01      26                                 
                    ALTERNATIVE PRODUCTS                19279      19279      0.04      13                                 
                    BIOPHYSICOCHEMICAL PROPERTIES        3405       3405      0.01      22                                 
                    BIOTECHNOLOGY                         282        280     <0.01      28                                 
                    CATALYTIC ACTIVITY                 230164     209712      0.44       4                                 
                    CAUTION                              7251       7104      0.01      19                                 
                    COFACTOR                           101772      93526      0.19       7                                 
                    DEVELOPMENTAL STAGE                  9115       9115      0.02      17                                 
                    DISEASE                              4440       3003      0.01      21                                 
                    DISRUPTION PHENOTYPE                 3266       3266      0.01      23                                 
                    DOMAIN                              33390      29552      0.06      11                                 
                    ENZYME REGULATION                    9444       9444      0.02      16                                 
                    FUNCTION                           394284     377974      0.75       2                                 
                    INDUCTION                           12683      12683      0.02      15                                 
                    INTERACTION                         12777      12777      0.02      14                                 
                    MASS SPECTROMETRY                    4675       3556      0.01      20                                 
                    MISCELLANEOUS                       30787      28415      0.06      12                                 
                    PATHWAY                            128518     117446      0.25       6                                 
                    PHARMACEUTICAL                         84         84     <0.01      29                                 
                    POLYMORPHISM                          810        771     <0.01      24                                 
                    PTM                                 37774      30370      0.07       9                                 
                    RNA EDITING                           619        619     <0.01      25                                 
                    SEQUENCE CAUTION                    38753      38753      0.07       8                                 
                    SIMILARITY                         615420     499633      1.17       1                                 
                    SUBCELLULAR LOCATION               306366     301149      0.58       3                                 
                    SUBUNIT                            226364     226364      0.43       5                                 
                    TISSUE SPECIFICITY                  34772      34772      0.07      10                                 
                    TOXIC DOSE                            456        445     <0.01      27                                 
                    WEB RESOURCE                         8605       6889      0.02      18                                 
                    
                    Total number of comment topics: 29
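
The Rank column of the comments table is consistent with ordering the topics by total number of lines, largest first. The Python sketch below reproduces that ordering for five rows copied from the table. `rank_by_total` is a hypothetical helper; how the original statistics break ties is not stated, so ties here simply keep their input order.

```python
def rank_by_total(rows):
    """Return {name: rank}, rank 1 for the largest total; ties keep input order."""
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    return {name: position for position, (name, _total) in enumerate(ordered, start=1)}

# (topic, total number of lines) copied from the comments table above.
comments = [
    ("CATALYTIC ACTIVITY", 230164),
    ("FUNCTION", 394284),
    ("SIMILARITY", 615420),
    ("SUBCELLULAR LOCATION", 306366),
    ("SUBUNIT", 226364),
]

ranks = rank_by_total(comments)
for topic, rank in sorted(ranks.items(), key=lambda item: item[1]):
    print(rank, topic)
```

These five topics are the five largest in the table, so the ranks computed on this subset coincide with the ranks 1 to 5 listed above.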
                    
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank
                    ------------------------------------  -------- ---------  ---------  ----
                    Features (FT)                        3336557                 6.36                                         
                    ACT_SITE                           131434      79300      0.25       9                                 
                    BINDING                            217533      60774      0.41       4                                 
                    CA_BIND                              3769       1556      0.01      35                                 
                    CARBOHYD                           102192      25863      0.19      13                                 
                    CHAIN                              531030     519062      1.01       1                                 
                    COILED                              18701      12753      0.04      26                                 
                    COMPBIAS                            50568      26465      0.10      18                                 
                    CONFLICT                           118702      41630      0.23      11                                 
                    CROSSLNK                             5967       3575      0.01      34                                 
                    DISULFID                            99330      26672      0.19      14                                 
                    DNA_BIND                            10989      10122      0.02      30                                 
                    DOMAIN                             151329      90488      0.29       6                                 
                    HELIX                              137748      14392      0.26       7                                 
                    INIT_MET                            14945      14945      0.03      27                                 
                    INTRAMEM                             1588        748     <0.01      38                                 
                    LIPID                               10683       6801      0.02      31                                 
                    METAL                              284654      70005      0.54       3                                 
                    MOD_RES                            182564      60443      0.35       5                                 
                    MOTIF                               32902      21166      0.06      23                                 
                    MUTAGEN                             33449       7924      0.06      22                                 
                    NON_CONS                             1905        722     <0.01      37                                 
                    NON_STD                               351        276     <0.01      39                                 
                    NON_TER                             11945       9082      0.02      28                                 
                    NP_BIND                            108035      69121      0.21      12                                 
                    PEPTIDE                              9233       6114      0.02      32                                 
                    PROPEP                              11678      10009      0.02      29                                 
                    REGION                              98777      53299      0.19      15                                 
                    REPEAT                              90642      13363      0.17      16                                 
                    SIGNAL                              35899      35889      0.07      21                                 
                    SITE                                38503      22900      0.07      20                                 
                    STRAND                             136783      13411      0.26       8                                 
                    TOPO_DOM                           119436      24570      0.23      10                                 
                    TRANSIT                              7237       7150      0.01      33                                 
                    TRANSMEM                           342025      70129      0.65       2                                 
                    TURN                                32453      11313      0.06      24                                 
                    UNSURE                               2281        448     <0.01      36                                 
                    VAR_SEQ                             39857      17137      0.08      19                                 
                    VARIANT                             80844      16560      0.15      17                                 
                    ZN_FING                             28596      12434      0.05      25                                 
                    
                    Total number of feature keys: 39
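
Because "Average per entry" is taken over all entries of the release, it says little about how densely a feature is annotated in the entries where it occurs. Dividing the total number of lines by the number of entries with at least one such line gives a complementary figure. The Python sketch below computes this derived ratio for three rows of the features table. The ratio is not part of the original statistics, and `lines_per_annotated_entry` is an illustrative name.

```python
def lines_per_annotated_entry(total_lines, entries_with_line):
    """Mean number of lines among entries that have at least one such line."""
    if entries_with_line <= 0:
        raise ValueError("entries_with_line must be positive")
    return total_lines / entries_with_line

# (feature key, total number of lines, number of entries) from the features table.
features = [
    ("CHAIN", 531030, 519062),
    ("HELIX", 137748, 14392),
    ("TRANSMEM", 342025, 70129),
]

for key, total, entries in features:
    print(f"{key:<10} {lines_per_annotated_entry(total, entries):.2f}")
```

For example, HELIX averages 0.26 lines over all entries, but the entries that have HELIX annotation carry between nine and ten such lines each.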
                    
                    
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank      Category
                    ------------------------------------  -------- ---------  ---------  ----      -------------------------------------------
                    Cross-references (DR)               13952328                26.61                                                           
                    2DBase-Ecoli                           85         85     <0.01     120      2D gel databases                             
                    Aarhus/Ghent-2DPAGE                   126         96     <0.01     117      2D gel databases                             
                    AGD                                   859        853     <0.01      96      Organism-specific databases                  
                    Allergome                            1305        786     <0.01      91      Protein family/group databases               
                    ANU-2DPAGE                             23         23     <0.01     126      2D gel databases                             
                    ArachnoServer                         759        755     <0.01      98      Organism-specific databases                  
                    ArrayExpress                        58468      58468      0.11      40      Gene expression databases                    
                    Bgee                                39549      39529      0.08      46      Gene expression databases                    
                    BindingDB                             297        297     <0.01     113      Other                                        
                    BioCyc                             252037     243427      0.48      19      Enzyme and pathway databases                 
                    BRENDA                              65233      62433      0.12      39      Enzyme and pathway databases                 
                    CAZy                                 7240       6495      0.01      69      Protein family/group databases               
                    CGD                                   585        576     <0.01     102      Organism-specific databases                  
                    CleanEx                             30173      29523      0.06      48      Gene expression databases                    
                    COMPLUYEAST-2DPAGE                    101        100     <0.01     119      2D gel databases                             
                    ConoServer                            613        587     <0.01     101      Organism-specific databases                  
                    Cornea-2DPAGE                          67         67     <0.01     121      2D gel databases                             
                    CTD                                 65533      64951      0.12      37      Organism-specific databases                  
                    CYGD                                 6638       6543      0.01      72      Organism-specific databases                  
                    dictyBase                            4114       4114      0.01      84      Organism-specific databases                  
                    DIP                                 12495      12373      0.02      63      Protein-protein interaction databases        
                    DisProt                               397        394     <0.01     108      3D structure databases                       
                    DOSAC-COBS-2DPAGE                     149        147     <0.01     116      2D gel databases                             
                    DrugBank                             5317       1626      0.01      74      Other                                        
                    EchoBASE                             4167       4163      0.01      83      Organism-specific databases                  
                    ECO2DBASE                             352        300     <0.01     110      2D gel databases                             
                    EcoGene                              4291       4289      0.01      82      Organism-specific databases                  
                    eggNOG                             218565     218565      0.42      20      Phylogenomic databases                       
                    EMBL                               874031     514282      1.67       3      Sequence databases                           
                    Ensembl                             74638      58151      0.14      33      Genome annotation databases                  
                    EnsemblBacteria                     97838      84677      0.19      28      Genome annotation databases                  
                    EnsemblFungi                        14721      14579      0.03      59      Genome annotation databases                  
                    EnsemblMetazoa                      12855       8450      0.02      62      Genome annotation databases                  
                    EnsemblPlants                       14116      12534      0.03      60      Genome annotation databases                  
                    EnsemblProtists                      4311       4196      0.01      81      Genome annotation databases                  
                    euHCVdb                                55         44     <0.01     122      Organism-specific databases                  
                    EuPathDB                              300        300     <0.01     112      Organism-specific databases                  
                    FlyBase                              5753       5379      0.01      73      Organism-specific databases                  
                    Gene3D                             293498     231778      0.56      17      Family and domain databases                  
                    GeneCards                           20442      19806      0.04      52      Organism-specific databases                  
                    GeneDB_Spombe                        4978       4934      0.01      76      Organism-specific databases                  
                    GeneFarm                             2741       2727      0.01      87      Organism-specific databases                  
                    GeneID                             462269     442899      0.88       7      Genome annotation databases                  
                    Genevestigator                      65483      65483      0.12      38      Gene expression databases                    
                    GenoList                             7049       7037      0.01      70      Organism-specific databases                  
                    GenomeReviews                      380384     360316      0.73      10      Genome annotation databases                  
                    GermOnline                          41925      41307      0.08      45      Gene expression databases                    
                    GlycoSuiteDB                          272        272     <0.01     114      PTM databases                                
                    GO                                2106261     491406      4.02       1      Ontologies                                   
                    Gramene                              4515       4515      0.01      79      Organism-specific databases                  
                    H-InvDB                             13208      12310      0.03      61      Organism-specific databases                  
                    HAMAP                              308967     308822      0.59      16      Family and domain databases                  
                    HGNC                                19709      19535      0.04      54      Organism-specific databases                  
                    HOGENOM                            362416     362416      0.69      12      Phylogenomic databases                       
                    HOVERGEN                            74671      74671      0.14      32      Phylogenomic databases                       
                    HPA                                 11297       8336      0.02      64      Organism-specific databases                  
                    HSSP                                29566      29566      0.06      49      3D structure databases                       
                    InParanoid                          67149      67149      0.13      36      Phylogenomic databases                       
                    IntAct                              24428      24428      0.05      51      Protein-protein interaction databases        
                    InterPro                          1711079     499757      3.26       2      Family and domain databases                  
                    IPI                                 90421      64819      0.17      30      Sequence databases                           
                    KEGG                               440271     419531      0.84       8      Genome annotation databases                  
                    LegioList                             761        759     <0.01      97      Organism-specific databases                  
                    Leproma                               671        668     <0.01     100      Organism-specific databases                  
                    MaizeGDB                              476        472     <0.01     104      Organism-specific databases                  
                    MEROPS                              10312       9979      0.02      65      Protein family/group databases               
                    MGI                                 16233      16188      0.03      57      Organism-specific databases                  
                    MIM                                 16518      12910      0.03      56      Organism-specific databases                  
                    MINT                                17509      17509      0.03      55      Protein-protein interaction databases        
                    NextBio                             48887      48885      0.09      43      Other                                        
                    NMPDR                              131510     131505      0.25      25      Genome annotation databases                  
                    OGP                                   377        377     <0.01     109      2D gel databases                             
                    OMA                                370528     370528      0.71      11      Phylogenomic databases                       
                    Orphanet                             3759       2285      0.01      85      Organism-specific databases                  
                    OrthoDB                             57061      57061      0.11      41      Phylogenomic databases                       
                    PANTHER                            186866     171423      0.36      22      Family and domain databases                  
                    Pathway_Interaction_DB               4567       1665      0.01      78      Enzyme and pathway databases                 
                    PDB                                 72182      16434      0.14      35      3D structure databases                       
                    PDBsum                              72182      16434      0.14      34      3D structure databases                       
                    PeptideAtlas                         5168       5168      0.01      75      Proteomic databases                          
                    PeroxiBase                            739        728     <0.01      99      Protein family/group databases               
                    Pfam                               696740     488111      1.33       4      Family and domain databases                  
                    PharmGKB                            15423      15117      0.03      58      Organism-specific databases                  
                    PHCI-2DPAGE                           247        247     <0.01     115      2D gel databases                             
                    PhosphoSite                         20435      20435      0.04      53      PTM databases                                
                    PhosSite                              351        351     <0.01     111      PTM databases                                
                    PhylomeDB                          122602     122602      0.23      26      Phylogenomic databases                       
                    PIR                                116151     106142      0.22      27      Sequence databases                           
                    PIRSF                               85309      85309      0.16      31      Family and domain databases                  
                    PMAP-CutDB                           1394       1394     <0.01      90      Other                                        
                    PMMA-2DPAGE                            52         52     <0.01     123      2D gel databases                             
                    PptaseDB                               34         34     <0.01     124      Protein family/group databases               
                    PRIDE                               54342      54342      0.10      42      Proteomic databases                          
                    PRINTS                             137411     119200      0.26      24      Family and domain databases                  
                    ProDom                              27671      27492      0.05      50      Family and domain databases                  
                    ProMEX                                471        471     <0.01     105      Proteomic databases                          
                    PROSITE                            465647     296151      0.89       6      Family and domain databases                  
                    ProtClustDB                        325855     325855      0.62      14      Phylogenomic databases                       
                    ProteinModelPortal                 414099     414099      0.79       9      3D structure databases                       
                    PseudoCAP                            1222       1213     <0.01      93      Organism-specific databases                  
                    Rat-heart-2DPAGE                       28         28     <0.01     125      2D gel databases                             
                    Reactome                             8207       4817      0.02      67      Enzyme and pathway databases                 
                    REBASE                                442        399     <0.01     107      Protein family/group databases               
                    RefSeq                             484892     443192      0.92       5      Sequence databases                           
                    REPRODUCTION-2DPAGE                  1255       1034     <0.01      92      2D gel databases                             
                    RGD                                  7479       7475      0.01      68      Organism-specific databases                  
                    SGD                                  6638       6559      0.01      71      Organism-specific databases                  
                    Siena-2DPAGE                          102        102     <0.01     118      2D gel databases                             
                    SMART                              155397     118377      0.30      23      Family and domain databases                  
                    SMR                                348639     348639      0.66      13      3D structure databases                       
                    STRING                             205962     205941      0.39      21      Protein-protein interaction databases        
                    SUPFAM                             319141     253663      0.61      15      Family and domain databases                  
                    SWISS-2DPAGE                         1184       1183     <0.01      94      2D gel databases                             
                    TAIR                                10069       9983      0.02      66      Organism-specific databases                  
                    TCDB                                 3459       3450      0.01      86      Protein family/group databases               
                    TIGR                                34203      33432      0.07      47      Genome annotation databases                  
                    TIGRFAMs                           285571     265599      0.54      18      Family and domain databases                  
                    TubercuList                          1751       1715     <0.01      89      Organism-specific databases                  
                    UCD-2DPAGE                            511        502     <0.01     103      2D gel databases                             
                    UCSC                                48623      39632      0.09      44      Genome annotation databases                  
                    UniGene                             92820      84993      0.18      29      Sequence databases                           
                    VectorBase                            455        441     <0.01     106      Genome annotation databases                  
                    World-2DPAGE                          915        904     <0.01      95      2D gel databases                             
                    WormBase                             4619       3799      0.01      77      Organism-specific databases                  
                    Xenbase                              4400       4334      0.01      80      Organism-specific databases                  
                    ZFIN                                 2649       2638      0.01      88      Organism-specific databases                  
                    
                    Total number of cross-referenced databases: 126
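
The rows above can be read mechanically. The sketch below is an illustration only: the column headers fall outside this part of the table, so the field names (cross_references, entries, average, rank, category) are assumptions, as is the helper name parse_row. Under that reading, the third numeric column appears to be the first count divided by the 524420 entries in the release, which the SMART row is used to check.

```python
import re

TOTAL_ENTRIES = 524420  # total entries in release 2011_01 (from the introduction)

def parse_row(line):
    """Split one fixed-width table row on runs of two or more spaces.

    Field names are assumed; the table header is not part of this excerpt.
    """
    name, n_refs, n_entries, avg, rank, category = re.split(r"\s{2,}", line.strip())
    return {
        "name": name,
        "cross_references": int(n_refs),  # assumed meaning of column 1
        "entries": int(n_entries),        # assumed meaning of column 2
        "average": avg,                   # kept as text because it may be "<0.01"
        "rank": int(rank),
        "category": category,
    }

row = parse_row(
    "SMART                              155397     118377      0.30      23"
    "      Family and domain databases                  "
)

# Recompute the per-entry average from the first count and the release size.
recomputed = round(row["cross_references"] / TOTAL_ENTRIES, 2)
print(row["name"], row["average"], recomputed)
```

Splitting on two or more spaces keeps multi-word categories such as "Family and domain databases" intact, because their words are separated by single spaces.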
                    
                    6.  AMINO ACID COMPOSITION
                    
                    6.1  Composition in percent for the complete database
                    
                    Ala (A) 8.27   Gln (Q) 3.93   Leu (L) 9.67   Ser (S) 6.52
                    Arg (R) 5.53   Glu (E) 6.76   Lys (K) 5.85   Thr (T) 5.33
                    Asn (N) 4.05   Gly (G) 7.09   Met (M) 2.42   Trp (W) 1.08
                    Asp (D) 5.45   His (H) 2.27   Phe (F) 3.86   Tyr (Y) 2.92
                    Cys (C) 1.36   Ile (I) 5.98   Pro (P) 4.69   Val (V) 6.87
                    
                    Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.00
                    
                    
                    
                    [Figure not reproduced: amino acid composition chart.]
                    Legend: gray = aliphatic, red = acidic, green = small hydroxy,
                    blue = basic, black = aromatic, white = amide, yellow = sulfur
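
A composition table like the one in section 6.1 can be produced by counting residues over all sequences and dividing by the total. The following is a minimal sketch of that calculation, not the procedure actually used for the release. The function name composition_percent and the two toy sequences are hypothetical and are not Swiss-Prot data.

```python
from collections import Counter

def composition_percent(sequences):
    """Return {one-letter residue code: percent of all residues}."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq.upper())  # count every residue letter
    total = sum(counts.values())
    if total == 0:
        return {}
    return {aa: 100.0 * n / total for aa, n in counts.items()}

# Toy input for illustration only (8 residues in total).
toy_sequences = ["MKLA", "AAGL"]
pct = composition_percent(toy_sequences)

for aa, value in sorted(pct.items(), key=lambda item: item[1], reverse=True):
    print(f"{aa} {value:.2f}")
```

Ambiguity codes such as B, Z and X would simply appear as extra keys in the result, which is how they can be reported on a separate line.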
                    
                    
                    6.2  Classification of the amino acids by their frequency
                    
                    In decreasing order of frequency: Leu, Ala, Gly, Val, Glu, Ser, Ile, Lys,
                    Arg, Asp, Thr, Pro, Asn, Gln, Phe, Tyr, Met, His, Cys, Trp
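
The ordering in section 6.2 follows directly from the percentages in section 6.1. As a quick check, the sketch below sorts those published values in decreasing order. The variable names are illustrative.

```python
# Percent composition copied from section 6.1.
composition = {
    "Ala": 8.27, "Gln": 3.93, "Leu": 9.67, "Ser": 6.52,
    "Arg": 5.53, "Glu": 6.76, "Lys": 5.85, "Thr": 5.33,
    "Asn": 4.05, "Gly": 7.09, "Met": 2.42, "Trp": 1.08,
    "Asp": 5.45, "His": 2.27, "Phe": 3.86, "Tyr": 2.92,
    "Cys": 1.36, "Ile": 5.98, "Pro": 4.69, "Val": 6.87,
}

# Sort residue names by their percentage, highest first.
ranked = sorted(composition, key=composition.get, reverse=True)
print(", ".join(ranked))
```

No two residues share the same rounded percentage in this release, so the order is unambiguous.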
                    
                    
                    7.  MISCELLANEOUS STATISTICS
                    
                    4448 entries are encoded on a mitochondrion, and 3593 are encoded on a plasmid.
                    
                    12184 entries are encoded on a plastid, 
                    of which 21 are encoded on apicoplasts, 
                    11620 on chloroplasts, 
                    50 on organellar chromatophores,
                    145 on cyanelles, 
                    149 on non-photosynthetic plastids and 
                    199 on unspecified types of plastid.
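
The plastid figures above can be checked for internal consistency: the six subcategories should add up to the stated total. The sketch below does that arithmetic and also derives the chloroplast share. The dictionary keys are shorthand labels chosen for this example.

```python
plastid_total = 12184  # entries encoded on a plastid

plastid_breakdown = {
    "apicoplast": 21,
    "chloroplast": 11620,
    "organellar chromatophore": 50,
    "cyanelle": 145,
    "non-photosynthetic plastid": 149,
    "unspecified plastid": 199,
}

# The subcategory counts should account for every plastid-encoded entry.
breakdown_sum = sum(plastid_breakdown.values())
chloroplast_share = 100.0 * plastid_breakdown["chloroplast"] / plastid_total

print(breakdown_sum == plastid_total)
print(f"{chloroplast_share:.1f}% of plastid-encoded entries are chloroplast-encoded")
```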
                    
                    Number of entries with at least one sequence correction: 70235