UniProtKB/Swiss-Prot protein knowledgebase release 2010_04 statistics
                    
                    
                    1.  INTRODUCTION
                    
                    Release 2010_04 of 23-Mar-10 of UniProtKB/Swiss-Prot contains 516081 sequence entries,
                    comprising 181677051 amino acids abstracted from 188261 references. 
                    
                    901 sequences have been added since release 15.15, the sequence data of
                    127 existing entries has been updated and the annotations of
                    81694 entries have been revised.
                    
                    Number of fragments: 8648
                    Number of additional sequences produced by alternative splicing, initiation or promoter usage, or ribosomal frameshifting: 28958
                    
                    
                    Protein existence (PE):           entries     %
                    
                    1: Evidence at protein level        68873   13.3%
                    2: Evidence at transcript level     66865   13.0%
                    3: Inferred from homology          364459   70.6%
                    4: Predicted                        14343    2.8%
                    5: Uncertain                         1541    0.3%
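
The percentages in the table above can be re-derived from the entry counts. The following is a minimal Python sketch and is not part of the release notes: the variable names are illustrative, the counts are copied from the table, and it assumes each percentage is the count divided by the total number of entries, rounded to one decimal place.

```python
# Entry counts per protein existence (PE) level, copied from the table above.
pe_counts = {
    "1: Evidence at protein level": 68873,
    "2: Evidence at transcript level": 66865,
    "3: Inferred from homology": 364459,
    "4: Predicted": 14343,
    "5: Uncertain": 1541,
}

# The five levels should add up to the 516081 entries of the release.
total_entries = sum(pe_counts.values())

# Share of each PE level as a percentage, rounded to one decimal place
# (assumed rounding rule; the release notes do not state it).
pe_percent = {
    level: round(100 * count / total_entries, 1)
    for level, count in pe_counts.items()
}

for level, pct in pe_percent.items():
    print(f"{level:<34}{pe_counts[level]:>7}{pct:>7.1f}%")
```

Under these assumptions the five counts sum to the stated total of 516081 entries.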
                    
                    The growth of the database is summarized below.
                    
                    
                    
                    
                    2.  TAXONOMIC ORIGIN
                    
                    Total number of species represented in this release of UniProtKB/Swiss-Prot: 12071
                    
                    The twenty most represented species account for 107569 sequences,
                    or 20.8% of the total number of entries.
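
As a cross-check, the figure for the twenty most represented species can be recomputed from the per-species counts listed in section 2.2. This is an illustrative Python sketch, not part of the release notes; the variable names are assumptions, and the counts are those of ranks 1 to 20 in that table.

```python
# Entry counts for ranks 1-20 of the "most represented species" table (section 2.2).
top20_counts = [
    20279, 16240, 8961, 7491, 6569, 5753, 4974, 4386, 4258, 4176,
    3286, 3227, 3065, 2615, 2373, 2209, 2159, 1993, 1782, 1773,
]

total_entries = 516081  # number of sequence entries in release 2010_04

top20_total = sum(top20_counts)
# Share of the whole database, as a percentage rounded to one decimal place.
top20_share = round(100 * top20_total / total_entries, 1)

print(top20_total, top20_share)
```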
                    
                    
                    2.1 Table of the frequency of occurrence of species
                    
                    Species represented      1x: 5243
                                             2x: 1700
                                             3x:  894
                                             4x:  572
                                             5x:  423
                                             6x:  352
                                             7x:  241
                                             8x:  205
                                             9x:  186
                                            10x:  105
                                        11- 20x:  590
                                        21- 50x:  371
                                        51-100x:  176
                                          >100x: 1013
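
A table of this shape can be produced by binning the number of entries per species. The sketch below is a hypothetical Python illustration, not the procedure used by UniProt: `bin_label`, `frequency_table`, and the toy input are invented for the example, and only the row labels and the published row counts come from the table above.

```python
from collections import Counter

# Ranges used by the table above once a species has more than 10 entries.
RANGES = [(11, 20, "11- 20x"), (21, 50, "21- 50x"), (51, 100, "51-100x")]


def bin_label(n_entries):
    """Return the table row label for a species with n_entries entries."""
    if n_entries < 1:
        raise ValueError("a represented species has at least one entry")
    if n_entries <= 10:
        return f"{n_entries}x"
    for low, high, label in RANGES:
        if low <= n_entries <= high:
            return label
    return ">100x"


def frequency_table(entries_per_species):
    """Count how many species fall into each row of the table."""
    return Counter(bin_label(n) for n in entries_per_species.values())


# Hypothetical toy input, for illustration only (not release data).
toy = {"species A": 1, "species B": 1, "species C": 7,
       "species D": 15, "species E": 2500}
table = frequency_table(toy)

# Row counts copied from the table above; together they should cover
# every species represented in the release.
published_rows = [5243, 1700, 894, 572, 423, 352, 241, 205, 186, 105,
                  590, 371, 176, 1013]
published_total = sum(published_rows)
```

The published rows sum to 12071, the total number of species stated at the start of this section.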
                    
                    
                    2.2  Table of the most represented species
                    
                    ------  ---------  --------------------------------------------
                    Number  Frequency  Species
                    ------  ---------  --------------------------------------------
                    1      20279  Homo sapiens (Human)
                    2      16240  Mus musculus (Mouse)
                    3       8961  Arabidopsis thaliana (Mouse-ear cress)
                    4       7491  Rattus norvegicus (Rat)
                    5       6569  Saccharomyces cerevisiae (Baker's yeast)
                    6       5753  Bos taurus (Bovine)
                    7       4974  Schizosaccharomyces pombe (Fission yeast)
                    8       4386  Escherichia coli (strain K12)
                    9       4258  Bacillus subtilis
                    10       4176  Dictyostelium discoideum (Slime mold)
                    11       3286  Caenorhabditis elegans
                    12       3227  Xenopus laevis (African clawed frog)
                    13       3065  Drosophila melanogaster (Fruit fly)
                    14       2615  Danio rerio (Zebrafish) (Brachydanio rerio)
                    15       2373  Oryza sativa subsp. japonica (Rice)
                    16       2209  Pongo abelii (Sumatran orangutan)
                    17       2159  Gallus gallus (Chicken)
                    18       1993  Escherichia coli O157:H7
                    19       1782  Methanocaldococcus jannaschii (Methanococcus jannaschii)
                    20       1773  Haemophilus influenzae
                    21       1768  Salmonella typhimurium
                    22       1668  Escherichia coli O6
                    23       1666  Shigella flexneri
                    24       1624  Mycobacterium tuberculosis
                    25       1531  Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
                    26       1366  Sus scrofa (Pig)
                    27       1341  Salmonella typhi
                    28       1279  Pseudomonas aeruginosa
                    29       1213  Mycobacterium bovis
                    30       1162  Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey)
                    31       1015  Synechocystis sp. (strain PCC 6803)
                    32        996  Yersinia pestis
                    33        991  Archaeoglobus fulgidus
                    34        941  Vibrio cholerae
                    35        929  Salmonella paratyphi A
                    36        923  Staphylococcus aureus (strain N315)
                    37        922  Staphylococcus aureus (strain Mu50 / ATCC 700699)
                    38        912  Rhizobium meliloti (Sinorhizobium meliloti)
                    39        909  Acanthamoeba polyphaga mimivirus (APMV)
                    40        896  Staphylococcus aureus (strain COL)
                    41        894  Staphylococcus aureus (strain MW2)
                    42        888  Staphylococcus aureus (strain MSSA476)
                    43        885  Staphylococcus aureus (strain MRSA252)
                    44        882  Oryctolagus cuniculus (Rabbit)
                    45        879  Escherichia coli O6:K15:H31 (strain 536 / UPEC)
                    46        879  Salmonella choleraesuis
                    47        869  Shigella sonnei (strain Ss046)
                    48        863  Yersinia pseudotuberculosis
                    49        835  Escherichia coli O9:H4 (strain HS)
                    50        829  Escherichia coli O139:H28 (strain E24377A / ETEC)
                    51        824  Shigella boydii serotype 4 (strain Sb227)
                    52        818  Escherichia coli (strain UTI89 / UPEC)
                    53        817  Ashbya gossypii (Yeast) (Eremothecium gossypii)
                    54        814  Escherichia coli (strain ATCC 8739 / DSM 1576 / Crooks)
                    55        800  Shigella dysenteriae serotype 1 (strain Sd197)
                    56        796  Candida albicans (Yeast)
                    57        794  Vibrio parahaemolyticus
                    58        789  Kluyveromyces lactis (Yeast) (Candida sphaerica)
                    59        785  Escherichia coli (strain SMS-3-5 / SECEC)
                    60        778  Erwinia carotovora subsp. atroseptica (Pectobacterium atrosepticum)
                    61        777  Pasteurella multocida
                    62        773  Aquifex aeolicus
                    63        771  Neurospora crassa
                    64        765  Escherichia coli (strain K12 / DH10B)
                    65        765  Canis familiaris (Dog)
                    66        759  Escherichia coli O127:H6 (strain E2348/69 / EPEC)
                    67        759  Escherichia coli (strain K12 / BW2952)
                    68        757  Escherichia coli O17:K52:H18 (strain UMN026 / ExPEC)
                    69        757  Escherichia coli (strain 55989 / EAEC)
                    70        757  Staphylococcus epidermidis (strain ATCC 35984 / RP62A)
                    71        756  Escherichia coli O8 (strain IAI1)
                    72        756  Staphylococcus epidermidis (strain ATCC 12228)
                    73        751  Escherichia coli O45:K1 (strain S88 / ExPEC)
                    74        750  Escherichia coli (strain SE11)
                    75        750  Shigella flexneri serotype 5b (strain 8401)
                    76        748  Escherichia coli O7:K1 (strain IAI39 / ExPEC)
                    77        747  Candida glabrata (Yeast) (Torulopsis glabrata)
                    78        742  Escherichia coli O157:H7 (strain EC4115 / EHEC)
                    79        738  Streptomyces coelicolor
                    80        738  Photorhabdus luminescens subsp. laumondii
                    81        731  Vibrio vulnificus
                    82        730  Bacillus halodurans
                    83        726  Escherichia coli O81 (strain ED1a)
                    84        722  Bacillus anthracis
                    85        722  Yersinia enterocolitica serotype O:8 / biotype 1B (strain 8081)
                    86        719  Salmonella enteritidis PT4 (strain P125109)
                    87        715  Vibrio vulnificus (strain YJ016)
                    88        715  Salmonella paratyphi B (strain ATCC BAA-1250 / SPB7)
                    89        713  Yersinia pestis bv. Antiqua (strain Nepal516)
                    90        713  Salmonella paratyphi A (strain AKU_12601)
                    91        712  Yersinia pseudotuberculosis serotype O:1b (strain IP 31758)
                    92        711  Staphylococcus aureus (strain NCTC 8325)
                    93        710  Salmonella newport (strain SL254)
                    94        709  Salmonella heidelberg (strain SL476)
                    95        709  Yersinia pestis bv. Antiqua (strain Antiqua)
                    96        709  Salmonella agona (strain SL483)
                    97        708  Salmonella schwarzengrund (strain CVM19633)
                    98        706  Escherichia coli O1:K1 / APEC
                    99        699  Salmonella dublin (strain CT_02021853)
                    100        697  Enterobacter sp. (strain 638)
                    101        696  Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)
                    102        696  Shigella boydii serotype 18 (strain CDC 3083-94 / BS512)
                    103        687  Mycoplasma pneumoniae
                    104        686  Pan troglodytes (Chimpanzee)
                    105        685  Escherichia fergusonii (strain ATCC 35469 / DSM 13698 / CDC 0568-73)
                    106        684  Pseudomonas syringae pv. tomato
                    107        682  Salmonella gallinarum (strain 287/91 / NCTC 13346)
                    108        682  Klebsiella pneumoniae (strain 342)
                    109        676  Anabaena sp. (strain PCC 7120)
                    110        670  Pseudomonas putida (strain KT2440)
                    111        666  Yersinia pestis (strain Pestoides F)
                    112        665  Staphylococcus aureus (strain USA300)
                    113        664  Citrobacter koseri (strain ATCC BAA-895 / CDC 4225-83 / SGSC4696)
                    114        661  Mycobacterium leprae
                    115        658  Rhizobium sp. (strain NGR234)
                    116        653  Serratia proteamaculans (strain 568)
                    117        652  Zea mays (Maize)
                    118        646  Escherichia coli
                    119        645  Bradyrhizobium japonicum
                    120        641  Staphylococcus aureus (strain bovine RF122 / ET3-1)
                    121        638  Bacillus cereus (strain ATCC 14579 / DSM 31)
                    122        637  Yersinia pseudotuberculosis serotype O:3 (strain YPIII)
                    123        635  Salmonella arizonae (strain ATCC BAA-731 / CDC346-86 / RSK2980)
                    124        633  Yersinia pseudotuberculosis serotype IB (strain PB1/+)
                    125        620  Shewanella oneidensis
                    126        617  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
                    127        615  Treponema pallidum
                    128        613  Ralstonia solanacearum (Pseudomonas solanacearum)
                    129        608  Staphylococcus haemolyticus (strain JCSC1435)
                    130        608  Enterobacter sakazakii (strain ATCC BAA-894)
                    131        602  Emericella nidulans (Aspergillus nidulans)
                    132        602  Rhizobium loti (Mesorhizobium loti)
                    133        602  Staphylococcus saprophyticus subsp. saprophyticus 
                    134        600  Methanobacterium thermoautotrophicum
                    135        598  Yersinia pestis bv. Antiqua (strain Angola)
                    136        598  Salmonella paratyphi C (strain RKS4594)
                    137        596  Listeria monocytogenes
                    138        595  Photobacterium profundum (Photobacterium sp. (strain SS9))
                    139        593  Debaryomyces hansenii (Yeast) (Torulaspora hansenii)
                    140        592  Yarrowia lipolytica (Candida lipolytica)
                    141        590  Bacillus cereus (strain ATCC 10987)
                    142        589  Xanthomonas campestris pv. campestris
                    143        588  Listeria innocua
                    144        585  Rickettsia prowazekii
                    145        584  Helicobacter pylori (Campylobacter pylori)
                    146        582  Pectobacterium carotovorum subsp. carotovorum (strain PC1)
                    147        581  Lactococcus lactis subsp. lactis (Streptococcus lactis)
                    148        579  Neisseria meningitidis serogroup B
                    149        576  Brucella suis
                    150        572  Brucella melitensis
                    151        572  Buchnera aphidicola subsp. Acyrthosiphon pisum 
                    152        567  Bacillus thuringiensis subsp. konkukian
                    153        565  Helicobacter pylori J99 (Campylobacter pylori J99)
                    154        562  Buchnera aphidicola subsp. Schizaphis graminum
                    155        560  Bacillus cereus (strain ZK / E33L)
                    156        560  Pseudomonas syringae pv. syringae (strain B728a)
                    157        557  Pseudomonas aeruginosa (strain UCBPP-PA14)
                    158        556  Neisseria meningitidis serogroup A
                    159        555  Bacillus licheniformis (strain DSM 13 / ATCC 14580)
                    160        555  Xanthomonas axonopodis pv. citri (Citrus canker)
                    161        553  Vibrio fischeri (strain ATCC 700601 / ES114)
                    162        551  Pseudomonas fluorescens (strain Pf0-1)
                    163        549  Oceanobacillus iheyensis
                    164        545  Caulobacter crescentus (Caulobacter vibrioides)
                    165        545  Clostridium acetobutylicum
                    166        545  Pseudomonas fluorescens (strain Pf-5 / ATCC BAA-477)
                    167        538  Pseudomonas syringae pv. phaseolicola (strain 1448A / Race 6)
                    168        529  Listeria monocytogenes serotype 4b (strain F2365)
                    169        523  Erwinia tasmaniensis (strain DSM 17950 / Et1/99)
                    170        522  Sodalis glossinidius (strain morsitans)
                    171        521  Bordetella bronchiseptica (Alcaligenes bronchisepticus)
                    172        521  Xylella fastidiosa
                    173        519  Streptococcus pneumoniae
                    174        512  Xylella fastidiosa (strain Temecula1 / ATCC 700964)
                    175        510  Chromobacterium violaceum
                    176        509  Thermotoga maritima
                    177        509  Vibrio cholerae serotype O1 (strain ATCC 39541 / Ogawa 395 / O395)
                    178        507  Bordetella parapertussis
                    179        507  Buchnera aphidicola subsp. Baizongia pistaciae
                    180        507  Pseudomonas aeruginosa (strain PA7)
                    181        505  Bordetella pertussis
                    182        504  Haemophilus ducreyi
                    183        504  Geobacillus kaustophilus
                    184        503  Staphylococcus aureus (strain Newman)
                    185        500  Pseudomonas entomophila (strain L48)
                    186        498  Brucella abortus
                    187        497  Rickettsia conorii
                    188        496  Bacillus clausii (strain KSM-K16)
                    189        492  Haemophilus influenzae (strain 86-028NP)
                    190        492  Deinococcus radiodurans
                    191        492  Aspergillus fumigatus (Sartorya fumigata)
                    192        490  Xanthomonas campestris pv. campestris (strain 8004)
                    193        490  Vibrio harveyi (strain ATCC BAA-1116 / BB120)
                    194        490  Clostridium perfringens
                    195        488  Bacillus amyloliquefaciens (strain FZB42)
                    196        487  Burkholderia pseudomallei (Pseudomonas pseudomallei)
                    197        487  Shewanella sp. (strain MR-7)
                    198        484  Pseudomonas aeruginosa (strain LESB58)
                    199        484  Shewanella sp. (strain MR-4)
                    200        483  Mannheimia succiniciproducens (strain MBEL55E)
                    201        483  Mycoplasma genitalium
                    202        483  Staphylococcus aureus (strain Mu3 / ATCC 700698)
                    203        483  Corynebacterium glutamicum (Brevibacterium flavum)
                    204        482  Caenorhabditis briggsae
                    205        482  Streptomyces avermitilis
                    206        480  Proteus mirabilis (strain HI4320)
                    207        480  Oryza sativa subsp. indica (Rice)
                    208        478  Methanosarcina acetivorans
                    209        475  Synechococcus elongatus (strain PCC 7942) (Anacystis nidulans R2)
                    210        472  Burkholderia sp. (strain 383) (Burkholderia cepacia 
                    211        472  Pseudomonas putida (strain F1 / ATCC 700007)
                    212        472  Brucella abortus (strain 2308)
                    213        472  Thermosynechococcus elongatus (strain BP-1)
                    214        468  Acinetobacter sp. (strain ADP1)
                    215        468  Enterococcus faecalis (Streptococcus faecalis)
                    216        465  Pyrococcus horikoshii
                    217        465  Xanthomonas campestris pv. vesicatoria (strain 85-10)
                    218        465  Pseudomonas putida (strain GB-1)
                    219        464  Rhodopseudomonas palustris
                    220        464  Shewanella frigidimarina (strain NCIMB 400)
                    221        462  Anabaena variabilis (strain ATCC 29413 / PCC 7937)
                    222        462  Shewanella sp. (strain ANA-3)
                    223        461  Burkholderia mallei (Pseudomonas mallei)
                    224        460  Ralstonia eutropha  (Cupriavidus necator 
                    225        458  Lactobacillus plantarum
                    226        457  Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
                    227        457  Pyrococcus abyssi
                    228        457  Ralstonia eutropha (strain JMP134) (Alcaligenes eutrophus)
                    229        457  Methanosarcina mazei (Methanosarcina frisia)
                    230        455  Aeromonas hydrophila subsp. hydrophila (strain ATCC 7966 / NCIB 9240)
                    231        454  Staphylococcus aureus (strain JH1)
                    232        453  Rickettsia felis (Rickettsia azadi)
                    233        453  Xanthomonas oryzae pv. oryzae (strain MAFF 311018)
                    234        452  Shewanella baltica (strain OS185)
                    235        452  Pseudomonas putida (strain W619)
                    236        452  Halobacterium salinarium (Halobacterium halobium)
                    237        449  Streptococcus mutans
                    238        448  Staphylococcus aureus (strain JH9)
                    239        448  Thermoanaerobacter tengcongensis
                    240        447  Methylococcus capsulatus
                    241        447  Ovis aries (Sheep)
                    242        447  Aeromonas salmonicida (strain A449)
                    243        446  Rhodobacter sphaeroides (strain ATCC 17023 / 2.4.1 / NCIB 8253 / DSM 158)
                    244        445  Vibrio fischeri (strain MJ11)
                    245        444  Pseudomonas mendocina (strain ymp)
                    246        443  Hahella chejuensis (strain KCTC 2396)
                    247        441  Streptococcus pyogenes serotype M6
                    248        441  Dechloromonas aromatica (strain RCB)
                    249        440  Pyrococcus furiosus
                    250        439  Nicotiana tabacum (Common tobacco)
                    
                    
                    
                    2.3  Taxonomic distribution of the sequences
                    
                    
                    
                    Kingdom        sequences (% of the database)
                    Archaea           18197 (  4%)
                    Bacteria         323350 ( 63%)
                    Eukaryota        159679 ( 31%)
                    Viruses           14855 (  3%)
                    
                    
                    Within Eukaryota:
                    
                    
                    
                    Category            sequences (% of Eukaryota) (% of the complete database)
                    Human                  20280 ( 13%)           (  4%)
                    Other Mammalia         44620 ( 28%)           (  9%)
                    Other Vertebrata       16048 ( 10%)           (  3%)
                    Viridiplantae          28902 ( 18%)           (  6%)
                    Fungi                  25289 ( 16%)           (  5%)
                    Insecta                 7796 (  5%)           (  2%)
                    Nematoda                4046 (  3%)           (  1%)
                    Other                  12698 (  8%)           (  2%)
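
Each category in the table above is reported against two denominators: the Eukaryota total and the complete database. The following Python sketch (illustrative names, counts copied from the table, percentages assumed to be rounded to the nearest integer) recomputes both columns and checks that the categories add up to the Eukaryota total given in the kingdom table.

```python
total_entries = 516081  # all entries in release 2010_04

# Counts per category, copied from the "Within Eukaryota" table.
eukaryota_counts = {
    "Human": 20280,
    "Other Mammalia": 44620,
    "Other Vertebrata": 16048,
    "Viridiplantae": 28902,
    "Fungi": 25289,
    "Insecta": 7796,
    "Nematoda": 4046,
    "Other": 12698,
}

eukaryota_total = sum(eukaryota_counts.values())

# For each category: (% of Eukaryota, % of the complete database).
shares = {
    category: (
        round(100 * count / eukaryota_total),
        round(100 * count / total_entries),
    )
    for category, count in eukaryota_counts.items()
}

for category, (pct_euk, pct_db) in shares.items():
    print(f"{category:<18}{eukaryota_counts[category]:>7} ({pct_euk:>3}%) ({pct_db:>3}%)")
```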
                    
                    
                    
                    3.  SEQUENCE SIZE
                    
                    Distribution of the sequences by size (excluding fragments)
                    
                    From   To  Number             From   To  Number
                       1-  50    8384             1001-1100    3466
                      51- 100   39906             1101-1200    2394
                     101- 150   55816             1201-1300    1913
                     151- 200   55918             1301-1400    1784
                     201- 250   54474             1401-1500    1419
                     251- 300   47909             1501-1600     633
                     301- 350   48349             1601-1700     496
                     351- 400   41304             1701-1800     409
                     401- 450   33777             1801-1900     390
                     451- 500   27141             1901-2000     321
                     501- 550   19209             2001-2100     196
                     551- 600   13697             2101-2200     261
                     601- 650   11474             2201-2300     274
                     651- 700    8163             2301-2400     168
                     701- 750    6806             2401-2500     129
                     751- 800    4829                 >2500    1004
                     801- 850    4130
                     851- 900    4743
                     901- 950    3616
                     951-1000    2531
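
Because the size table excludes fragments, its rows should add up to the number of entries minus the number of fragments reported in the introduction. The Python sketch below checks this; the variable names are illustrative and it assumes every non-fragment entry is counted in exactly one size class.

```python
# Counts per size class, copied from the table above
# (1-50 up to 951-1000, then 1001-1100 up to >2500).
size_bins = [
    8384, 39906, 55816, 55918, 54474, 47909, 48349, 41304, 33777, 27141,
    19209, 13697, 11474, 8163, 6806, 4829, 4130, 4743, 3616, 2531,
    3466, 2394, 1913, 1784, 1419, 633, 496, 409, 390, 321,
    196, 261, 274, 168, 129, 1004,
]

total_entries = 516081  # entries in release 2010_04
fragments = 8648        # "Number of fragments" from the introduction

non_fragment_entries = total_entries - fragments
binned_entries = sum(size_bins)

print(binned_entries, non_fragment_entries)
```

Both values come to 507433, so the table is consistent with the fragment count under that assumption.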
                    
                    
                    
                    
                    The average sequence length in UniProtKB/Swiss-Prot is 352 amino acids.
                    
                    The shortest sequence is   GWA_SEPOF (P83570):     2 amino acids.
                    The longest sequence is  TITIN_MOUSE (A2ASS6): 35213 amino acids.
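
The average length quoted above is consistent with the totals given in the introduction. This short Python sketch assumes, without confirmation from the release notes, that the average is the total number of amino acids divided by the total number of entries (fragments included), rounded to the nearest integer.

```python
total_amino_acids = 181677051  # amino acids in release 2010_04
total_entries = 516081         # sequence entries in release 2010_04

# Mean sequence length over all entries (assumed definition).
average_length = total_amino_acids / total_entries

print(round(average_length))
```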
                    
                    
                    4.  JOURNAL CITATIONS
                    
                    Note: the following citation statistics reflect the number of distinct
                    journal citations.
                    
                    Total number of journals cited in this release of UniProtKB/Swiss-Prot: 2057
                    
                    
                    4.1 Table of the frequency of journal citations
                    
                    Journals cited      1x:  663
                                        2x:  280
                                        3x:  144
                                        4x:  102
                                        5x:   90
                                        6x:   61
                                        7x:   35
                                        8x:   35
                                        9x:   36
                                       10x:   28
                                   11- 20x:  164
                                   21- 50x:  164
                                   51-100x:   97
                                     >100x:  158
                    
                    
                    4.2  List of the most cited journals in UniProtKB/Swiss-Prot
                    
                    Nb    Citations   Journal name
                    --    ---------   -------------------------------------------------------------
                    1        17808   Journal of Biological Chemistry
                    2         8263   Proceedings of the National Academy of Sciences of the U.S.A.
                    3         5004   Journal of Bacteriology
                    4         4494   Gene
                    5         4488   Biochemical and Biophysical Research Communications
                    6         4294   Nucleic Acids Research
                    7         3942   FEBS Letters
                    8         3833   Biochemistry
                    9         3720   The EMBO Journal
                    10         3391   Molecular and Cellular Biology
                    11         3210   Nature
                    12         3086   European Journal of Biochemistry
                    13         3016   Journal of Molecular Biology
                    14         2963   Biochimica et Biophysica Acta
                    15         2661   Cell
                    16         2471   Genomics
                    17         2165   Biochemical Journal
                    18         2108   Science
                    19         2024   Journal of Virology
                    20         1753   Molecular Microbiology
                    21         1557   Journal of Cell Biology
                    22         1491   Plant Molecular Biology
                    23         1356   Genes and Development
                    24         1347   Virology
                    25         1308   Nature Genetics
                    26         1305   Human Molecular Genetics
                    27         1303   Molecular and General Genetics
                    28         1287   Plant Physiology
                    29         1202   The American Journal of Human Genetics
                    30         1169   Oncogene
                    31         1154   Journal of Biochemistry
                    32         1146   Development
                    33         1082   Human Mutation
                    34         1008   Molecular Biology of the Cell
                    35         1005   Journal of Immunology
                    36          974   Genetics
                    37          881   Structure
                    38          875   Infection and Immunity
                    39          868   Journal of General Virology
                    40          844   The Plant Cell
                    41          817   Archives of Biochemistry and Biophysics
                    42          800   Molecular Cell
                    43          791   Blood
                    44          756   Yeast
                    45          748   Microbiology
                    46          725   The Plant Journal
                    47          721   Journal of Cell Science
                    48          720   Developmental Biology
                    49          664   Cancer Research
                    50          649   FEMS Microbiology Letters
                    51          642   Current Biology
                    52          590   Mechanisms of Development
                    53          590   Human Genetics
                    54          586   Nature Structural Biology
                    55          546   Acta Crystallographica, Section D
                    56          543   Protein Science
                    57          530   Applied and Environmental Microbiology
                    58          529   Journal of Neuroscience
                    59          523   Current Genetics
                    60          505   Toxicon
                    61          500   Journal of Clinical Investigation
                    62          498   Neuron
                    63          469   Mammalian Genome
                    64          457   American Journal of Physiology
                    65          449   Immunogenetics
                    66          441   The Journal of Experimental Medicine
                    67          437   Molecular Endocrinology
                    68          419   Molecular and Biochemical Parasitology
                    69          407   Journal of Neurochemistry
                    70          399   The Journal of Clinical Endocrinology and Metabolism
                    71          385   Endocrinology
                    72          376   Journal of Molecular Evolution
                    73          365   DNA and Cell Biology
                    74          363   Proteins
                    75          356   DNA Sequence
                    76          353   Molecular Biology and Evolution
                    77          353   Bioscience, Biotechnology, and Biochemistry
                    78          347   Journal of Medical Genetics
                    79          319   Tissue Antigens
                    80          314   Brain Research. Molecular Brain Research
                    81          296   Plant and Cell Physiology
                    82          292   Peptides
                    83          291   Experimental Cell Research
                    84          289   Biological Chemistry Hoppe-Seyler
                    85          289   Comparative Biochemistry and Physiology
                    86          289   Nature Cell Biology
                    87          279   Antimicrobial Agents and Chemotherapy
                    88          277   Journal of Investigative Dermatology
                    89          275   Cytogenetics and Cell Genetics
                    90          268   Molecular Pharmacology
                    91          255   Biology of Reproduction
                    92          247   Journal of General Microbiology
                    93          246   Genome Research
                    94          241   Neurology
                    95          240   Developmental Dynamics
                    96          240   Developmental Cell
                    97          239   RNA
                    98          231   Virus Research
                    99          215   Hoppe-Seyler's Zeitschrift fur Physiologische Chemie
                    100          205   DNA Research
                    101          205   Planta
                    102          203   European Journal of Immunology
                    103          202   Molecular Plant-Microbe Interactions
                    104          200   Biochimie
                    105          199   Annals of Neurology
                    106          193   European Journal of Human Genetics
                    107          193   Genes to Cells
                    108          190   Eukaryotic Cell
                    109          184   Immunity
                    110          182   Journal of Human Genetics
                    111          175   The New England Journal of Medicine
                    112          175   Molecular and Cellular Endocrinology
                    113          171   Nature Structural and Molecular Biology
                    114          167   Investigative Ophthalmology and Visual Science
                    115          164   Archives of Microbiology
                    116          163   American Journal of Medical Genetics
                    117          163   Molecular Phylogenetics and Evolution
                    118          159   Insect Biochemistry and Molecular Biology
                    119          159   DNA
                    120          158   EMBO Reports
                    121          153   Hemoglobin
                    122          153   The FASEB Journal
                    123          153   The FEBS Journal
                    124          151   Bioorganicheskaia Khimiia
                    125          150   Molecular Immunology
                    126          149   Diabetes
                    127          148   Molecular Reproduction and Development
                    128          145   Archives of Virology
                    129          142   Glycobiology
                    130          142   Clinical Genetics
                    131          136   General and Comparative Endocrinology
                    132          136   International Journal of Cancer
                    133          136   Animal Genetics
                    134          136   Molecular Genetics and Metabolism
                    135          132   Molecular and Cellular Neuroscience
                    136          130   British Journal of Haematology
                    137          128   Journal of the American Chemical Society
                    138          128   Journal of Cellular Biochemistry
                    139          125   Biological Chemistry
                    140          123   American Journal of Medical Genetics. Part A
                    141          123   Molecular Genetics and Genomics
                    142          121   BMC Genomics
                    143          121   Nature Immunology
                    144          120   Journal of Lipid Research
                    145          120   Agricultural and Biological Chemistry
                    146          114   Thrombosis and Haemostasis
                    147          114   Proteomics
                    148          113   Circulation Research
                    149          113   Neuroscience Letters
                    150          113   Journal of Protein Chemistry
                    
                    
                    5.  STATISTICS FOR SOME LINE TYPES
                    
                    The following tables give, for selected UniProtKB/Swiss-Prot line types,
                    the total number of lines, the number of entries with at least one such
                    line, and the average number of such lines per entry.
                    
                                                          Total    Number of  Average
                    Line type / subtype                   number   entries    per entry  Rank
                    ------------------------------------  -------- ---------  ---------  ----
                    
                    References (RL)                       919812                 1.78                                         
                    Journal                            728205     387001      1.41       1                                 
                    Submitted to EMBL/GenBank/DDBJ     178910     165552      0.35       2                                 
                    Submitted to other databases        10638       9232      0.02       3                                 
                    Book citation                         635        621     <0.01       4                                 
                    Plant Gene Register                   560        548     <0.01       5                                 
                    Thesis                                395        392     <0.01       6                                 
                    Unpublished observations              294        290     <0.01       7                                 
                    Patent                                169        167     <0.01       8                                 
                    Worm Breeder's Gazette                  6          6     <0.01       9                                 
                    
                    Total number of distinct authors cited in UniProtKB/Swiss-Prot: 287254
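
The "Average per entry" column in these tables appears to be the total number of lines divided by the 516081 entries of the release, rounded to two decimals, with anything that rounds below 0.01 printed as "<0.01". The sketch below reproduces the column under that assumption. The function name `average_per_entry` is illustrative and not part of any UniProt tool.

```python
TOTAL_ENTRIES = 516081  # sequence entries in release 2010_04 (see the introduction)


def average_per_entry(total_lines: int, total_entries: int = TOTAL_ENTRIES) -> str:
    """Format lines-per-entry the way the tables appear to (assumed rule)."""
    rounded = round(total_lines / total_entries, 2)
    if rounded < 0.01:
        # Very rare line types are reported as "<0.01" rather than "0.00".
        return "<0.01"
    return f"{rounded:.2f}"
```

For example, `average_per_entry(919812)` gives "1.78" for the References (RL) total, and `average_per_entry(635)` gives "<0.01" for book citations.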
                    
                                                          Total    Number of  Average
                    Line type / subtype                   number   entries    per entry  Rank
                    ------------------------------------  -------- ---------  ---------  ----
                    Comments (CC)                        2181114                 4.23                                         
                    ALLERGEN                              460        460     <0.01      26                                 
                    ALTERNATIVE PRODUCTS                18709      18709      0.04      12                                 
                    BIOPHYSICOCHEMICAL PROPERTIES        2970       2970      0.01      22                                 
                    BIOTECHNOLOGY                         269        267     <0.01      28                                 
                    CATALYTIC ACTIVITY                 218878     199657      0.42       5                                 
                    CAUTION                              6847       6707      0.01      19                                 
                    COFACTOR                            99437      91331      0.19       7                                 
                    DEVELOPMENTAL STAGE                  8789       8789      0.02      16                                 
                    DISEASE                              4295       2895      0.01      20                                 
                    DISRUPTION PHENOTYPE                 2522       2522     <0.01      23                                 
                    DOMAIN                              31688      28031      0.06      10                                 
                    ENZYME REGULATION                    7820       7820      0.02      18                                 
                    FUNCTION                           384399     368359      0.74       2                                 
                    INDUCTION                           11597      11597      0.02      15                                 
                    INTERACTION                         12355      12355      0.02      14                                 
                    MASS SPECTROMETRY                    4284       3239      0.01      21                                 
                    MISCELLANEOUS                       30104      27840      0.06      11                                 
                    PATHWAY                            126492     115545      0.25       6                                 
                    PHARMACEUTICAL                         83         83     <0.01      29                                 
                    POLYMORPHISM                          779        745     <0.01      24                                 
                    PTM                                 35399      28677      0.07       8                                 
                    RNA EDITING                           603        603     <0.01      25                                 
                    SEQUENCE CAUTION                    12775      12775      0.02      13                                 
                    SIMILARITY                         600084     491428      1.16       1                                 
                    SUBCELLULAR LOCATION               297232     292169      0.58       3                                 
                    SUBUNIT                            220758     220758      0.43       4                                 
                    TISSUE SPECIFICITY                  32772      32772      0.06       9                                 
                    TOXIC DOSE                            419        408     <0.01      27                                 
                    WEB RESOURCE                         8295       6585      0.02      17                                 
                    
                    Total number of comment topics: 29
                    
                    
                                                          Total    Number of  Average
                    Line type / subtype                   number   entries    per entry  Rank
                    ------------------------------------  -------- ---------  ---------  ----
                    Features (FT)                        3212734                 6.23                                         
                    ACT_SITE                           128238      76541      0.25       9                                 
                    BINDING                            200890      57568      0.39       4                                 
                    CA_BIND                              3653       1480      0.01      35                                 
                    CARBOHYD                            96579      24736      0.19      13                                 
                    CHAIN                              522566     511327      1.01       1                                 
                    COILED                              18188      12268      0.04      26                                 
                    COMPBIAS                            48980      25569      0.09      18                                 
                    CONFLICT                           115817      40622      0.22      10                                 
                    CROSSLNK                             4810       3118      0.01      34                                 
                    DISULFID                            94876      25297      0.18      14                                 
                    DNA_BIND                            10899      10032      0.02      29                                 
                    DOMAIN                             143253      85547      0.28       6                                 
                    HELIX                              130288      13628      0.25       8                                 
                    INIT_MET                            14814      14814      0.03      27                                 
                    LIPID                               10525       6702      0.02      30                                 
                    METAL                              272448      67103      0.53       3                                 
                    MOD_RES                            178168      58525      0.35       5                                 
                    MOTIF                               32067      20651      0.06      22                                 
                    MUTAGEN                             30364       7222      0.06      24                                 
                    NON_CONS                             1555        640     <0.01      36                                 
                    NON_STD                               348        273     <0.01      38                                 
                    NON_TER                             11687       8907      0.02      28                                 
                    NP_BIND                            104665      68293      0.20      12                                 
                    PEPTIDE                              8545       5471      0.02      32                                 
                    PROPEP                              10480       8835      0.02      31                                 
                    REGION                              91974      50606      0.18      15                                 
                    REPEAT                              87641      12947      0.17      16                                 
                    SIGNAL                              33861      33851      0.07      21                                 
                    SITE                                36789      21767      0.07      20                                 
                    STRAND                             130834      12738      0.25       7                                 
                    TOPO_DOM                           115279      23711      0.22      11                                 
                    TRANSIT                              6477       6391      0.01      33                                 
                    TRANSMEM                           336954      68914      0.65       2                                 
                    TURN                                31088      10764      0.06      23                                 
                    UNSURE                               1111        354     <0.01      37                                 
                    VAR_SEQ                             38771      16651      0.08      19                                 
                    VARIANT                             79241      16448      0.15      17                                 
                    ZN_FING                             28011      12221      0.05      25                                 
                    
                    Total number of feature keys: 38
                    
                    
                    
                                                          Total    Number of  Average
                    Line type / subtype                   number   entries    per entry  Rank      Category
                    ------------------------------------  -------- ---------  ---------  ----      -------------------------------------------
                    Cross-references (DR)               13010686                25.21                                                           
                    2DBase-Ecoli                           85         85     <0.01     112      2D gel databases                             
                    Aarhus/Ghent-2DPAGE                   126         96     <0.01     110      2D gel databases                             
                    AGD                                   823        817     <0.01      89      Organism-specific databases                  
                    ANU-2DPAGE                             23         23     <0.01     120      2D gel databases                             
                    ArachnoServer                         463        459     <0.01      97      Organism-specific databases                  
                    ArrayExpress                        58096      58096      0.11      38      Gene expression databases                    
                    Bgee                                37672      37664      0.07      44      Gene expression databases                    
                    BindingDB                             297        297     <0.01     104      Other                                        
                    BioCyc                             160600     147725      0.31      21      Enzyme and pathway databases                 
                    BRENDA                              65165      62369      0.13      35      Enzyme and pathway databases                 
                    CAZy                                 5698       5077      0.01      67      Protein family/group databases               
                    CGD                                   556        551     <0.01      94      Organism-specific databases                  
                    CleanEx                             30209      29562      0.06      46      Gene expression databases                    
                    COMPLUYEAST-2DPAGE                     59         59     <0.01     115      2D gel databases                             
                    ConoServer                            613        588     <0.01      93      Organism-specific databases                  
                    Cornea-2DPAGE                          67         67     <0.01     114      2D gel databases                             
                    CTD                                 63544      62974      0.12      37      Organism-specific databases                  
                    CYGD                                 6629       6534      0.01      66      Organism-specific databases                  
                    dictyBase                            4299       4175      0.01      74      Organism-specific databases                  
                    DIP                                 11508      11403      0.02      57      Protein-protein interaction databases        
                    DisProt                               397        394     <0.01     100      3D structure databases                       
                    DOSAC-COBS-2DPAGE                     150        150     <0.01     109      2D gel databases                             
                    DrugBank                             5317       1626      0.01      69      Other                                        
                    EchoBASE                             4162       4134      0.01      76      Organism-specific databases                  
                    ECO2DBASE                             352        300     <0.01     103      2D gel databases                             
                    EcoGene                              4364       4362      0.01      73      Organism-specific databases                  
                    eggNOG                             216560     216560      0.42      18      Phylogenomic databases                       
                    EMBL                               850129     506366      1.65       3      Sequence databases                           
                    Ensembl                             90126      69660      0.17      28      Genome annotation databases                  
                    euHCVdb                                55         44     <0.01     116      Organism-specific databases                  
                    EuPathDB                              231        231     <0.01     108      Organism-specific databases                  
                    FlyBase                              5520       5144      0.01      68      Organism-specific databases                  
                    Gene3D                             235666     193591      0.46      17      Family and domain databases                  
                    GeneCards                           21063      19807      0.04      50      Organism-specific databases                  
                    GeneDB_Spombe                        4976       4931      0.01      71      Organism-specific databases                  
                    GeneFarm                             2691       2676      0.01      82      Organism-specific databases                  
                    GeneID                             468009     448891      0.91       6      Genome annotation databases                  
                    Genevestigator                      64469      64469      0.12      36      Gene expression databases                    
                    GenoList                             7040       7028      0.01      64      Organism-specific databases                  
                    GenomeReviews                      376860     356814      0.73       9      Genome annotation databases                  
                    GermOnline                          41924      41324      0.08      43      Gene expression databases                    
                    GlycoSuiteDB                          280        280     <0.01     105      PTM databases                                
                    GO                                2158479     482039      4.18       1      Ontologies                                   
                    Gramene                              4293       4293      0.01      75      Organism-specific databases                  
                    H-InvDB                             11798      11059      0.02      56      Organism-specific databases                  
                    HAMAP                              307287     307143      0.60      15      Family and domain databases                  
                    HGNC                                19561      19387      0.04      51      Organism-specific databases                  
                    HOGENOM                            359491     359491      0.70      10      Phylogenomic databases                       
                    HOVERGEN                            74308      74308      0.14      31      Phylogenomic databases                       
                    HPA                                  8704       6562      0.02      61      Organism-specific databases                  
                    HSC-2DPAGE                             85         85     <0.01     113      2D gel databases                             
                    HSSP                                28925      28925      0.06      47      3D structure databases                       
                    InParanoid                          65903      65903      0.13      32      Phylogenomic databases                       
                    IntAct                              21850      21850      0.04      49      Protein-protein interaction databases        
                    InterPro                          1584866     487577      3.07       2      Family and domain databases                  
                    IPI                                 88455      63459      0.17      29      Sequence databases                           
                    KEGG                               438996     417268      0.85       8      Genome annotation databases                  
                    LegioList                             760        758     <0.01      90      Organism-specific databases                  
                    Leproma                               664        661     <0.01      92      Organism-specific databases                  
                    MaizeGDB                              472        467     <0.01      96      Organism-specific databases                  
                    MEROPS                               9957       9635      0.02      59      Protein family/group databases               
                    MGI                                 16118      16067      0.03      53      Organism-specific databases                  
                    MIM                                 15872      12488      0.03      54      Organism-specific databases                  
                    MINT                                10883      10883      0.02      58      Protein-protein interaction databases        
                    NextBio                             48717      48717      0.09      41      Other                                        
                    NMPDR                              130162     130158      0.25      24      Genome annotation databases                  
                    OGP                                   377        377     <0.01     102      2D gel databases                             
                    OMA                                353472     353472      0.68      11      Phylogenomic databases                       
                    Orphanet                             3674       2131      0.01      80      Organism-specific databases                  
                    OrthoDB                             55566      55566      0.11      39      Phylogenomic databases                       
                    PANTHER                            185009     169820      0.36      20      Family and domain databases                  
                    Pathway_Interaction_DB               4567       1665      0.01      72      Enzyme and pathway databases                 
                    PDB                                 65698      15499      0.13      34      3D structure databases                       
                    PDBsum                              65698      15499      0.13      33      3D structure databases                       
                    PeptideAtlas                         5167       5167      0.01      70      Proteomic databases                          
                    PeroxiBase                            677        665     <0.01      91      Protein family/group databases               
                    Pfam                               662051     467369      1.28       4      Family and domain databases                  
                    PharmGKB                            15809      15798      0.03      55      Organism-specific databases                  
                    PHCI-2DPAGE                           247        247     <0.01     107      2D gel databases                             
                    PhosphoSite                         19305      19305      0.04      52      PTM databases                                
                    PhosSite                              266        266     <0.01     106      PTM databases                                
                    PhylomeDB                          121240     121240      0.23      25      Phylogenomic databases                       
                    PIR                                115171     105196      0.22      26      Sequence databases                           
                    PIRSF                               82141      82141      0.16      30      Family and domain databases                  
                    PMAP-CutDB                           1394       1394     <0.01      85      Other                                        
                    PMMA-2DPAGE                            52         52     <0.01     117      2D gel databases                             
                    PptaseDB                               34         34     <0.01     118      Protein family/group databases               
                    PRIDE                               52085      52085      0.10      40      Proteomic databases                          
                    PRINTS                             136405     117988      0.26      23      Family and domain databases                  
                    ProDom                              27785      27456      0.05      48      Family and domain databases                  
                    ProMEX                                440        440     <0.01      98      Proteomic databases                          
                    PROSITE                            457145     291359      0.89       7      Family and domain databases                  
                    ProtClustDB                        324023     324023      0.63      13      Phylogenomic databases                       
                    PseudoCAP                            1218       1209     <0.01      86      Organism-specific databases                  
                    Rat-heart-2DPAGE                       28         28     <0.01     119      2D gel databases                             
                    Reactome                             7331       4257      0.01      63      Enzyme and pathway databases                 
                    REBASE                                379        358     <0.01     101      Protein family/group databases               
                    RefSeq                             488608     449176      0.95       5      Sequence databases                           
                    REPRODUCTION-2DPAGE                  1029        941     <0.01      88      2D gel databases                             
                    RGD                                  7372       7368      0.01      62      Organism-specific databases                  
                    SGD                                  6641       6550      0.01      65      Organism-specific databases                  
                    Siena-2DPAGE                          103        103     <0.01     111      2D gel databases                             
                    SMART                              142018     109563      0.28      22      Family and domain databases                  
                    SMR                                345947     345947      0.67      12      3D structure databases                       
                    STRING                             203599     203583      0.39      19      Protein-protein interaction databases        
                    SUPFAM                             312097     247152      0.60      14      Family and domain databases                  
                    SWISS-2DPAGE                         1183       1183     <0.01      87      2D gel databases                             
                    TAIR                                 9043       8932      0.02      60      Organism-specific databases                  
                    TCDB                                 3295       3254      0.01      81      Protein family/group databases               
                    TIGR                                33973      33206      0.07      45      Genome annotation databases                  
                    TIGRFAMs                           280343     261617      0.54      16      Family and domain databases                  
                    TubercuList                          1650       1614     <0.01      84      Organism-specific databases                  
                    UCSC                                48497      39524      0.09      42      Genome annotation databases                  
                    UniGene                             92047      81084      0.18      27      Sequence databases                           
                    VectorBase                            421        407     <0.01      99      Genome annotation databases                  
                    World-2DPAGE                          507        507     <0.01      95      2D gel databases                             
                    WormBase                             3822       3737      0.01      79      Organism-specific databases                  
                    WormPep                              4057       3277      0.01      77      Organism-specific databases                  
                    Xenbase                              3941       3869      0.01      78      Organism-specific databases                  
                    ZFIN                                 2560       2549     <0.01      83      Organism-specific databases                  
                    
                    Total number of cross-referenced databases: 120
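
For readers who want to process the tables of this section programmatically, the following is a minimal sketch of a row parser. It assumes that columns are separated by runs of two or more spaces while names and categories contain at most single spaces, which holds for the rows shown here but is not a documented format. The names `StatRow` and `parse_stat_row` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class StatRow(NamedTuple):
    name: str
    total: int               # total number of lines
    entries: int             # entries with at least one such line
    average: str             # kept as text because of values like "<0.01"
    rank: int
    category: Optional[str]  # only present in the cross-reference table


def parse_stat_row(line: str) -> StatRow:
    """Split one subtype row on runs of two or more whitespace characters."""
    fields = re.split(r"\s{2,}", line.strip())
    if len(fields) not in (5, 6):
        # Summary rows such as "References (RL)" have fewer columns.
        raise ValueError(f"unexpected row layout: {line!r}")
    name, total, entries, average, rank = fields[:5]
    category = fields[5] if len(fields) == 6 else None
    return StatRow(name, int(total), int(entries), average, int(rank), category)
```

Summary rows (for example "Cross-references (DR)") carry no entry count or rank, so the sketch rejects them with a `ValueError` rather than guessing at the missing columns.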
                    
                    6.  AMINO ACID COMPOSITION
                    
                    6.1  Composition in percent for the complete database
                    
                    Ala (A) 8.28   Gln (Q) 3.94   Leu (L) 9.67   Ser (S) 6.50
                    Arg (R) 5.53   Glu (E) 6.76   Lys (K) 5.85   Thr (T) 5.32
                    Asn (N) 4.05   Gly (G) 7.09   Met (M) 2.43   Trp (W) 1.07
                    Asp (D) 5.45   His (H) 2.27   Phe (F) 3.86   Tyr (Y) 2.91
                    Cys (C) 1.36   Ile (I) 5.99   Pro (P) 4.68   Val (V) 6.87
                    
                    Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.00
                    
                    
                    
                    [Figure: amino acid composition chart, colour-coded by residue class]
                    Legend: gray = aliphatic, red = acidic, green = small hydroxy,
                    blue = basic, black = aromatic, white = amide, yellow = sulfur
                    
                    
                    6.2  Classification of the amino acids by their frequency
                    
                    Leu, Ala, Gly, Val, Glu, Ser, Ile, Lys, Arg, Asp, Thr, Pro, Asn, Gln,
                    Phe, Tyr, Met, His, Cys, Trp
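
The ordering in 6.2 follows from sorting the percentages of 6.1 in decreasing order. A short sketch that reproduces it; the values are copied from 6.1, and the names `COMPOSITION` and `rank_by_frequency` are illustrative only.

```python
# Percent composition for the complete database, copied from section 6.1.
COMPOSITION = {
    "Ala": 8.28, "Gln": 3.94, "Leu": 9.67, "Ser": 6.50,
    "Arg": 5.53, "Glu": 6.76, "Lys": 5.85, "Thr": 5.32,
    "Asn": 4.05, "Gly": 7.09, "Met": 2.43, "Trp": 1.07,
    "Asp": 5.45, "His": 2.27, "Phe": 3.86, "Tyr": 2.91,
    "Cys": 1.36, "Ile": 5.99, "Pro": 4.68, "Val": 6.87,
}


def rank_by_frequency(composition: dict) -> list:
    """Return residue names ordered from most to least frequent."""
    return sorted(composition, key=composition.get, reverse=True)


print(", ".join(rank_by_frequency(COMPOSITION)))
```

No two residues share the same percentage in this release, so the order does not depend on how ties are broken.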
                    
                    
                    7.  MISCELLANEOUS STATISTICS
                    
                    4446 entries are encoded on a mitochondrion, and 3556 are encoded on a plasmid.
                    
                    12175 entries are encoded on a plastid, 
                    of which 21 are encoded on apicoplasts, 
                    11617 on chloroplasts, 
                    44 on organellar chromatophores,
                    145 on cyanelles, 
                    149 on non-photosynthetic plastids and 
                    199 on unspecified types of plastid.
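
As a consistency check, the plastid subtypes listed above should add up to the plastid total. A minimal sketch using the figures as printed; the names `PLASTID_BREAKDOWN` and `breakdown_is_consistent` are illustrative.

```python
PLASTID_TOTAL = 12175  # entries encoded on a plastid

# Per-subtype counts, copied from the list above.
PLASTID_BREAKDOWN = {
    "apicoplast": 21,
    "chloroplast": 11617,
    "organellar chromatophore": 44,
    "cyanelle": 145,
    "non-photosynthetic plastid": 149,
    "unspecified plastid": 199,
}


def breakdown_is_consistent(total: int, parts: dict) -> bool:
    """True when the subtype counts sum exactly to the stated total."""
    return sum(parts.values()) == total


print(breakdown_is_consistent(PLASTID_TOTAL, PLASTID_BREAKDOWN))
```

The check passes for this release, so under these figures each plastid-encoded entry is counted in exactly one subtype.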
                    
                    Number of entries with at least one sequence correction: 68596