UniProtKB/Swiss-Prot protein knowledgebase release 2010_08 statistics
                    
                    
                    1.  INTRODUCTION
                    
                    Release 2010_08 of 13-Jul-10 of UniProtKB/Swiss-Prot contains 518415 sequence entries,
                    comprising 182829264 amino acids abstracted from 190192 references. 
                    
                    646 sequences have been added since release 2010_07, the sequence data of
                    118 existing entries has been updated and the annotations of
                    262208 entries have been revised.
                    
                    Number of fragments: 8708
                    Number of additional sequences produced by alternative splicing, initiation or promoter usage, or ribosomal frameshifting: 29306
                    
                    
                    Protein existence (PE):           entries     %
                    
                    1: Evidence at protein level        70117   13.5%
                    2: Evidence at transcript level     67013   12.9%
                    3: Inferred from homology          365349   70.5%
                    4: Predicted                        14328    2.8%
                    5: Uncertain                         1608    0.3%
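
The percentage column above can be reproduced from the entry counts. The following is a minimal sketch, not the release tooling itself; the names `pe_counts`, `percent`, and `pe_percent` are illustrative assumptions, and the counts are copied from the table.

```python
# Entry counts copied from the protein existence (PE) table above.
pe_counts = {
    "1: Evidence at protein level": 70117,
    "2: Evidence at transcript level": 67013,
    "3: Inferred from homology": 365349,
    "4: Predicted": 14328,
    "5: Uncertain": 1608,
}

# The five PE levels partition the release, so their sum is the entry count.
total = sum(pe_counts.values())


def percent(count, total):
    """Share of `total` as a percentage, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


pe_percent = {label: percent(n, total) for label, n in pe_counts.items()}

for label, n in pe_counts.items():
    print(f"{label:<34}{n:>7}{pe_percent[label]:>7.1f}%")
```

Summing the five counts gives 518415, the number of sequence entries stated in the introduction.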
                    
                    The growth of the database is summarized below.
                    
                    
                    
                    
                    2.  TAXONOMIC ORIGIN
                    
                    Total number of species represented in this release of UniProtKB/Swiss-Prot: 12146
                    
                    The first twenty species represent 108194 sequences:  20.9 % of the total
                    number of entries.
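
As a consistency check, the figure for the first twenty species can be recomputed from the first twenty rows of table 2.2. This is an illustrative sketch; the variable names are assumptions, and the frequencies are copied from that table.

```python
# Frequencies of the twenty most represented species (table 2.2, rows 1-20).
top20 = [
    20291, 16301, 9160, 7519, 6577, 5781, 4975, 4429, 4254, 4245,
    3293, 3244, 3081, 2641, 2474, 2211, 2170, 1993, 1782, 1773,
]

total_entries = 518415  # sequence entries in release 2010_08

top20_total = sum(top20)
top20_share = round(100.0 * top20_total / total_entries, 1)

print(top20_total, top20_share)
```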
                    
                    
                    2.1 Table of the frequency of occurrence of species
                    
                    Species represented   1x: 5264
                                          2x: 1724
                                          3x:  898
                                          4x:  577
                                          5x:  426
                                          6x:  351
                                          7x:  245
                                          8x:  209
                                          9x:  187
                                         10x:  106
                                     11- 20x:  596
                                     21- 50x:  372
                                     51-100x:  173
                                       >100x: 1018
                    
                    
                    2.2  Table of the most represented species
                    
                    ------  ---------  --------------------------------------------
                    Number  Frequency  Species
                    ------  ---------  --------------------------------------------
                    1      20291  Homo sapiens (Human)
                    2      16301  Mus musculus (Mouse)
                    3       9160  Arabidopsis thaliana (Mouse-ear cress)
                    4       7519  Rattus norvegicus (Rat)
                    5       6577  Saccharomyces cerevisiae (Baker's yeast)
                    6       5781  Bos taurus (Bovine)
                    7       4975  Schizosaccharomyces pombe (Fission yeast)
                    8       4429  Escherichia coli (strain K12)
                    9       4254  Bacillus subtilis
                    10       4245  Dictyostelium discoideum (Slime mold)
                    11       3293  Caenorhabditis elegans
                    12       3244  Xenopus laevis (African clawed frog)
                    13       3081  Drosophila melanogaster (Fruit fly)
                    14       2641  Danio rerio (Zebrafish) (Brachydanio rerio)
                    15       2474  Oryza sativa subsp. japonica (Rice)
                    16       2211  Pongo abelii (Sumatran orangutan)
                    17       2170  Gallus gallus (Chicken)
                    18       1993  Escherichia coli O157:H7
                    19       1782  Methanocaldococcus jannaschii (Methanococcus jannaschii)
                    20       1773  Haemophilus influenzae
                    21       1772  Salmonella typhimurium
                    22       1668  Shigella flexneri
                    23       1667  Escherichia coli O6
                    24       1666  Mycobacterium tuberculosis
                    25       1542  Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
                    26       1370  Sus scrofa (Pig)
                    27       1342  Salmonella typhi
                    28       1282  Pseudomonas aeruginosa
                    29       1213  Mycobacterium bovis
                    30       1162  Macaca fascicularis (Crab eating macaque) (Cynomolgus monkey)
                    31       1015  Synechocystis sp. (strain PCC 6803)
                    32        997  Yersinia pestis
                    33        991  Archaeoglobus fulgidus
                    34        942  Vibrio cholerae
                    35        929  Salmonella paratyphi A
                    36        924  Staphylococcus aureus (strain N315)
                    37        923  Staphylococcus aureus (strain Mu50 / ATCC 700699)
                    38        913  Rhizobium meliloti (Sinorhizobium meliloti)
                    39        909  Acanthamoeba polyphaga mimivirus (APMV)
                    40        897  Staphylococcus aureus (strain COL)
                    41        895  Staphylococcus aureus (strain MW2)
                    42        889  Staphylococcus aureus (strain MSSA476)
                    43        886  Staphylococcus aureus (strain MRSA252)
                    44        882  Oryctolagus cuniculus (Rabbit)
                    45        881  Salmonella choleraesuis
                    46        879  Escherichia coli O6:K15:H31 (strain 536 / UPEC)
                    47        870  Shigella sonnei (strain Ss046)
                    48        864  Yersinia pseudotuberculosis
                    49        835  Escherichia coli O9:H4 (strain HS)
                    50        829  Escherichia coli O139:H28 (strain E24377A / ETEC)
                    51        825  Shigella boydii serotype 4 (strain Sb227)
                    52        821  Ashbya gossypii (Yeast) (Eremothecium gossypii)
                    53        817  Escherichia coli (strain UTI89 / UPEC)
                    54        814  Escherichia coli (strain ATCC 8739 / DSM 1576 / Crooks)
                    55        802  Shigella dysenteriae serotype 1 (strain Sd197)
                    56        800  Candida albicans (Yeast)
                    57        794  Vibrio parahaemolyticus
                    58        793  Kluyveromyces lactis (Yeast) (Candida sphaerica)
                    59        785  Escherichia coli (strain SMS-3-5 / SECEC)
                    60        778  Erwinia carotovora subsp. atroseptica (Pectobacterium atrosepticum)
                    61        777  Pasteurella multocida
                    62        775  Neurospora crassa
                    63        773  Aquifex aeolicus
                    64        766  Canis familiaris (Dog) (Canis lupus familiaris)
                    65        765  Escherichia coli (strain K12 / DH10B)
                    66        759  Escherichia coli O127:H6 (strain E2348/69 / EPEC)
                    67        759  Escherichia coli (strain K12 / MC4100 / BW2952)
                    68        757  Escherichia coli O17:K52:H18 (strain UMN026 / ExPEC)
                    69        757  Escherichia coli (strain 55989 / EAEC)
                    70        757  Staphylococcus epidermidis (strain ATCC 35984 / RP62A)
                    71        756  Escherichia coli O8 (strain IAI1)
                    72        756  Staphylococcus epidermidis (strain ATCC 12228)
                    73        751  Escherichia coli O45:K1 (strain S88 / ExPEC)
                    74        750  Candida glabrata (Yeast) (Torulopsis glabrata)
                    75        750  Escherichia coli (strain SE11)
                    76        750  Shigella flexneri serotype 5b (strain 8401)
                    77        748  Escherichia coli O7:K1 (strain IAI39 / ExPEC)
                    78        742  Escherichia coli O157:H7 (strain EC4115 / EHEC)
                    79        738  Streptomyces coelicolor
                    80        738  Photorhabdus luminescens subsp. laumondii
                    81        731  Vibrio vulnificus
                    82        730  Bacillus halodurans
                    83        726  Escherichia coli O81 (strain ED1a)
                    84        723  Yersinia enterocolitica serotype O:8 / biotype 1B (strain 8081)
                    85        722  Bacillus anthracis
                    86        720  Salmonella enteritidis PT4 (strain P125109)
                    87        715  Vibrio vulnificus (strain YJ016)
                    88        715  Salmonella paratyphi B (strain ATCC BAA-1250 / SPB7)
                    89        713  Yersinia pestis bv. Antiqua (strain Nepal516)
                    90        713  Salmonella paratyphi A (strain AKU_12601)
                    91        712  Staphylococcus aureus (strain NCTC 8325)
                    92        712  Yersinia pseudotuberculosis serotype O:1b (strain IP 31758)
                    93        711  Salmonella newport (strain SL254)
                    94        710  Salmonella heidelberg (strain SL476)
                    95        710  Salmonella agona (strain SL483)
                    96        709  Yersinia pestis bv. Antiqua (strain Antiqua)
                    97        709  Salmonella schwarzengrund (strain CVM19633)
                    98        706  Escherichia coli O1:K1 / APEC
                    99        705  Emericella nidulans (Aspergillus nidulans)
                    100        700  Salmonella dublin (strain CT_02021853)
                    101        698  Enterobacter sp. (strain 638)
                    102        697  Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)
                    103        697  Shigella boydii serotype 18 (strain CDC 3083-94 / BS512)
                    104        687  Mycoplasma pneumoniae
                    105        685  Pan troglodytes (Chimpanzee)
                    106        685  Escherichia fergusonii (strain ATCC 35469 / DSM 13698 / CDC 0568-73)
                    107        684  Klebsiella pneumoniae (strain 342)
                    108        684  Pseudomonas syringae pv. tomato
                    109        682  Salmonella gallinarum (strain 287/91 / NCTC 13346)
                    110        677  Anabaena sp. (strain PCC 7120)
                    111        671  Pseudomonas putida (strain KT2440)
                    112        666  Staphylococcus aureus (strain USA300)
                    113        666  Citrobacter koseri (strain ATCC BAA-895 / CDC 4225-83 / SGSC4696)
                    114        666  Yersinia pestis (strain Pestoides F)
                    115        662  Mycobacterium leprae
                    116        658  Rhizobium sp. (strain NGR234)
                    117        655  Serratia proteamaculans (strain 568)
                    118        653  Zea mays (Maize)
                    119        645  Escherichia coli
                    120        645  Bradyrhizobium japonicum
                    121        642  Staphylococcus aureus (strain bovine RF122 / ET3-1)
                    122        638  Bacillus cereus (strain ATCC 14579 / DSM 31)
                    123        637  Yersinia pseudotuberculosis serotype O:3 (strain YPIII)
                    124        635  Salmonella arizonae (strain ATCC BAA-731 / CDC346-86 / RSK2980)
                    125        633  Yersinia pseudotuberculosis serotype IB (strain PB1/+)
                    126        620  Shewanella oneidensis
                    127        617  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
                    128        615  Treponema pallidum
                    129        614  Ralstonia solanacearum (Pseudomonas solanacearum)
                    130        608  Staphylococcus haemolyticus (strain JCSC1435)
                    131        608  Enterobacter sakazakii (strain ATCC BAA-894)
                    132        603  Rhizobium loti (Mesorhizobium loti)
                    133        602  Staphylococcus saprophyticus subsp. saprophyticus 
                    134        601  Methanobacterium thermoautotrophicum
                    135        599  Salmonella paratyphi C (strain RKS4594)
                    136        598  Yersinia pestis bv. Antiqua (strain Angola)
                    137        597  Debaryomyces hansenii (Yeast) (Torulaspora hansenii)
                    138        596  Listeria monocytogenes
                    139        596  Yarrowia lipolytica (Candida lipolytica)
                    140        595  Photobacterium profundum (Photobacterium sp. (strain SS9))
                    141        590  Bacillus cereus (strain ATCC 10987)
                    142        590  Xanthomonas campestris pv. campestris
                    143        588  Listeria innocua
                    144        585  Rickettsia prowazekii
                    145        584  Helicobacter pylori (Campylobacter pylori)
                    146        584  Pectobacterium carotovorum subsp. carotovorum (strain PC1)
                    147        581  Lactococcus lactis subsp. lactis (Streptococcus lactis)
                    148        579  Neisseria meningitidis serogroup B
                    149        576  Brucella suis
                    150        572  Brucella melitensis
                    151        572  Buchnera aphidicola subsp. Acyrthosiphon pisum 
                    152        570  Aspergillus fumigatus (Sartorya fumigata)
                    153        569  Bacillus thuringiensis subsp. konkukian
                    154        565  Helicobacter pylori J99 (Campylobacter pylori J99)
                    155        562  Buchnera aphidicola subsp. Schizaphis graminum
                    156        560  Bacillus cereus (strain ZK / E33L)
                    157        560  Pseudomonas syringae pv. syringae (strain B728a)
                    158        557  Pseudomonas aeruginosa (strain UCBPP-PA14)
                    159        556  Neisseria meningitidis serogroup A
                    160        555  Bacillus licheniformis (strain DSM 13 / ATCC 14580)
                    161        555  Xanthomonas axonopodis pv. citri (Citrus canker)
                    162        553  Vibrio fischeri (strain ATCC 700601 / ES114)
                    163        551  Pseudomonas fluorescens (strain Pf0-1)
                    164        549  Oceanobacillus iheyensis
                    165        545  Caulobacter crescentus (Caulobacter vibrioides)
                    166        545  Clostridium acetobutylicum
                    167        545  Pseudomonas fluorescens (strain Pf-5 / ATCC BAA-477)
                    168        538  Pseudomonas syringae pv. phaseolicola (strain 1448A / Race 6)
                    169        529  Listeria monocytogenes serotype 4b (strain F2365)
                    170        524  Erwinia tasmaniensis (strain DSM 17950 / Et1/99)
                    171        522  Sodalis glossinidius (strain morsitans)
                    172        521  Bordetella bronchiseptica (Alcaligenes bronchisepticus)
                    173        521  Xylella fastidiosa
                    174        519  Streptococcus pneumoniae
                    175        515  Caenorhabditis briggsae
                    176        512  Chromobacterium violaceum
                    177        512  Xylella fastidiosa (strain Temecula1 / ATCC 700964)
                    178        510  Thermotoga maritima
                    179        509  Vibrio cholerae serotype O1 (strain ATCC 39541 / Ogawa 395 / O395)
                    180        507  Bordetella parapertussis
                    181        507  Buchnera aphidicola subsp. Baizongia pistaciae (strain Bp)
                    182        507  Pseudomonas aeruginosa (strain PA7)
                    183        505  Bordetella pertussis
                    184        504  Staphylococcus aureus (strain Newman)
                    185        504  Haemophilus ducreyi
                    186        504  Geobacillus kaustophilus
                    187        500  Pseudomonas entomophila (strain L48)
                    188        498  Brucella abortus
                    189        497  Rickettsia conorii
                    190        497  Deinococcus radiodurans
                    191        496  Bacillus clausii (strain KSM-K16)
                    192        494  Oryza sativa subsp. indica (Rice)
                    193        492  Haemophilus influenzae (strain 86-028NP)
                    194        490  Xanthomonas campestris pv. campestris (strain 8004)
                    195        490  Vibrio harveyi (strain ATCC BAA-1116 / BB120)
                    196        490  Clostridium perfringens
                    197        488  Bacillus amyloliquefaciens (strain FZB42)
                    198        487  Burkholderia pseudomallei (Pseudomonas pseudomallei)
                    199        487  Shewanella sp. (strain MR-7)
                    200        485  Corynebacterium glutamicum (Brevibacterium flavum)
                    201        484  Pseudomonas aeruginosa (strain LESB58)
                    202        484  Staphylococcus aureus (strain Mu3 / ATCC 700698)
                    203        484  Shewanella sp. (strain MR-4)
                    204        483  Mannheimia succiniciproducens (strain MBEL55E)
                    205        483  Mycoplasma genitalium
                    206        482  Streptomyces avermitilis
                    207        481  Proteus mirabilis (strain HI4320)
                    208        479  Methanosarcina acetivorans
                    209        475  Synechococcus elongatus (strain PCC 7942) (Anacystis nidulans R2)
                    210        472  Burkholderia sp. (strain 383) (Burkholderia cepacia 
                    211        472  Pseudomonas putida (strain F1 / ATCC 700007)
                    212        472  Brucella abortus (strain 2308)
                    213        472  Thermosynechococcus elongatus (strain BP-1)
                    214        469  Acinetobacter sp. (strain ADP1)
                    215        469  Enterococcus faecalis (Streptococcus faecalis)
                    216        465  Pyrococcus horikoshii
                    217        465  Xanthomonas campestris pv. vesicatoria (strain 85-10)
                    218        465  Pseudomonas putida (strain GB-1)
                    219        464  Rhodopseudomonas palustris
                    220        464  Shewanella frigidimarina (strain NCIMB 400)
                    221        462  Anabaena variabilis (strain ATCC 29413 / PCC 7937)
                    222        462  Shewanella sp. (strain ANA-3)
                    223        461  Burkholderia mallei (Pseudomonas mallei)
                    224        460  Ralstonia eutropha  (Cupriavidus necator 
                    225        459  Lactobacillus plantarum
                    226        457  Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
                    227        457  Pyrococcus abyssi
                    228        457  Ralstonia eutropha (strain JMP134) (Alcaligenes eutrophus)
                    229        457  Methanosarcina mazei (Methanosarcina frisia)
                    230        456  Aeromonas hydrophila subsp. hydrophila (strain ATCC 7966 / NCIB 9240)
                    231        455  Staphylococcus aureus (strain JH1)
                    232        453  Rickettsia felis (Rickettsia azadi)
                    233        453  Xanthomonas oryzae pv. oryzae (strain MAFF 311018)
                    234        452  Shewanella baltica (strain OS185)
                    235        452  Pseudomonas putida (strain W619)
                    236        452  Halobacterium salinarium (Halobacterium halobium)
                    237        449  Staphylococcus aureus (strain JH9)
                    238        449  Streptococcus mutans
                    239        448  Thermoanaerobacter tengcongensis
                    240        447  Methylococcus capsulatus
                    241        447  Ovis aries (Sheep)
                    242        447  Aeromonas salmonicida (strain A449)
                    243        446  Vibrio fischeri (strain MJ11)
                    244        446  Rhodobacter sphaeroides (strain ATCC 17023 / 2.4.1 / NCIB 8253 / DSM 158)
                    245        444  Pseudomonas mendocina (strain ymp)
                    246        443  Hahella chejuensis (strain KCTC 2396)
                    247        443  Dechloromonas aromatica (strain RCB)
                    248        441  Streptococcus pyogenes serotype M6
                    249        440  Pyrococcus furiosus
                    250        439  Mycobacterium paratuberculosis
                    
                    
                    
                    2.3  Taxonomic distribution of the sequences
                    
                    
                    
                    Kingdom        sequences (% of the database)
                    Archaea           18201 (  4%)
                    Bacteria         323640 ( 62%)
                    Eukaryota        161696 ( 31%)
                    Viruses           14878 (  3%)
                    
                    
                    Within Eukaryota:
                    
                    
                    
                    Category            sequences (% of Eukaryota) (% of the complete database)
                    Human                  20292 ( 13%)           (  4%)
                    Other Mammalia         44812 ( 28%)           (  9%)
                    Other Vertebrata       16186 ( 10%)           (  3%)
                    Viridiplantae          29271 ( 18%)           (  6%)
                    Fungi                  26184 ( 16%)           (  5%)
                    Insecta                 8032 (  5%)           (  2%)
                    Nematoda                4086 (  3%)           (  1%)
                    Other                  12833 (  8%)           (  2%)
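
Each row of the table above carries two percentages with different denominators: the Eukaryota total and the complete database. A minimal sketch of that arithmetic follows; the names `eukaryota`, `shares`, and the two totals are illustrative assumptions, and the counts are copied from the table.

```python
# Sequence counts per category, copied from the "Within Eukaryota" table.
eukaryota = {
    "Human": 20292,
    "Other Mammalia": 44812,
    "Other Vertebrata": 16186,
    "Viridiplantae": 29271,
    "Fungi": 26184,
    "Insecta": 8032,
    "Nematoda": 4086,
    "Other": 12833,
}

eukaryota_total = sum(eukaryota.values())  # should equal the Eukaryota row of the kingdom table
database_total = 518415                    # sequence entries in the release

# For each category: (% of Eukaryota, % of the complete database), rounded to integers.
shares = {
    name: (
        round(100.0 * n / eukaryota_total),
        round(100.0 * n / database_total),
    )
    for name, n in eukaryota.items()
}

for name, (of_euk, of_db) in shares.items():
    print(f"{name:<18}{eukaryota[name]:>7} ({of_euk:>3}%) ({of_db:>3}%)")
```

The category counts sum to 161696, the Eukaryota figure given in the kingdom table.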
                    
                    
                    
                    3.  SEQUENCE SIZE
                    
                    Distribution of the sequences by size (excluding fragments)
                    
                    From   To   Number             From   To   Number
                       1-  50     8502             1001-1100     3528
                      51- 100    39984             1101-1200     2450
                     101- 150    55879             1201-1300     1930
                     151- 200    55992             1301-1400     1788
                     201- 250    54712             1401-1500     1423
                     251- 300    48066             1501-1600      636
                     301- 350    48504             1601-1700      507
                     351- 400    41489             1701-1800      422
                     401- 450    33942             1801-1900      394
                     451- 500    27298             1901-2000      322
                     501- 550    19372             2001-2100      199
                     551- 600    13782             2101-2200      265
                     601- 650    11536             2201-2300      276
                     651- 700     8232             2301-2400      168
                     701- 750     6847             2401-2500      129
                     751- 800     4889             >2500         1016
                     801- 850     4224
                     851- 900     4801
                     901- 950     3650
                     951-1000     2553
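
The size table uses bins of 50 residues up to 1000, bins of 100 residues up to 2500, and a single open bin above that. The sketch below shows one way a sequence length could be mapped to its bin; `size_bin` is a hypothetical helper, not part of any UniProt tool. It also checks that the published counts add up to the number of entries minus the number of fragments given in the introduction.

```python
def size_bin(length):
    """Return the table's range label for a sequence length in amino acids."""
    if length < 1:
        raise ValueError("length must be a positive integer")
    if length > 2500:
        return ">2500"
    # Bins are 50 residues wide up to 1000 and 100 residues wide from 1001 to 2500.
    width = 50 if length <= 1000 else 100
    low = ((length - 1) // width) * width + 1
    return f"{low}-{low + width - 1}"


# Counts copied from the table: 1-50 through 951-1000, then 1001-1100 through >2500.
left_counts = [
    8502, 39984, 55879, 55992, 54712, 48066, 48504, 41489, 33942, 27298,
    19372, 13782, 11536, 8232, 6847, 4889, 4224, 4801, 3650, 2553,
]
right_counts = [
    3528, 2450, 1930, 1788, 1423, 636, 507, 422, 394, 322,
    199, 265, 276, 168, 129, 1016,
]

binned_total = sum(left_counts) + sum(right_counts)
non_fragment_entries = 518415 - 8708  # entries minus fragments

print(size_bin(2), size_bin(1001), size_bin(35213))
print(binned_total == non_fragment_entries)
```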
                    
                    
                    
                    
                    The average sequence length in UniProtKB/Swiss-Prot is 352 amino acids.
                    
                    The shortest sequence is   GWA_SEPOF (P83570):     2 amino acids.
                    The longest sequence is  TITIN_MOUSE (A2ASS6): 35213 amino acids.
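
The average length can be related to the totals in the introduction. The sketch below assumes the average is taken over all 518415 entries and all 182829264 amino acids. Under that assumption the published value of 352 corresponds to the quotient with its fractional part dropped, which is an inference from the numbers rather than a documented rule.

```python
total_amino_acids = 182829264  # from the introduction
total_entries = 518415         # from the introduction

exact_mean = total_amino_acids / total_entries     # floating-point mean
floored_mean = total_amino_acids // total_entries  # fractional part dropped

print(f"exact mean:   {exact_mean:.2f}")
print(f"floored mean: {floored_mean}")
```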
                    
                    
                    4.  JOURNAL CITATIONS
                    
                    Note: the following citation statistics reflect the number of distinct
                    journal citations.
                    
                    Total number of journals cited in this release of UniProtKB/Swiss-Prot: 2076
                    
                    
                    4.1 Table of the frequency of journal citations
                    
                    Journals cited   1x:  664
                                     2x:  287
                                     3x:  146
                                     4x:  104
                                     5x:   88
                                     6x:   63
                                     7x:   33
                                     8x:   39
                                     9x:   35
                                    10x:   26
                                11- 20x:  168
                                21- 50x:  165
                                51-100x:   98
                                  >100x:  160
                    
                    
                    4.2  List of the most cited journals in UniProtKB/Swiss-Prot
                    
                    Nb    Citations   Journal name
                    --    ---------   -------------------------------------------------------------
                    1        18017   Journal of Biological Chemistry
                    2         8344   Proceedings of the National Academy of Sciences of the U.S.A.
                    3         5033   Journal of Bacteriology
                    4         4530   Biochemical and Biophysical Research Communications
                    5         4503   Gene
                    6         4306   Nucleic Acids Research
                    7         3972   FEBS Letters
                    8         3876   Biochemistry
                    9         3755   The EMBO Journal
                    10         3424   Molecular and Cellular Biology
                    11         3236   Nature
                    12         3104   European Journal of Biochemistry
                    13         3062   Journal of Molecular Biology
                    14         2974   Biochimica et Biophysica Acta
                    15         2686   Cell
                    16         2479   Genomics
                    17         2181   Biochemical Journal
                    18         2136   Science
                    19         2030   Journal of Virology
                    20         1771   Molecular Microbiology
                    21         1577   Journal of Cell Biology
                    22         1498   Plant Molecular Biology
                    23         1369   Genes and Development
                    24         1348   Virology
                    25         1330   Plant Physiology
                    26         1324   Human Molecular Genetics
                    27         1318   Nature Genetics
                    28         1304   Molecular and General Genetics
                    29         1253   The American Journal of Human Genetics
                    30         1182   Oncogene
                    31         1170   Development
                    32         1161   Journal of Biochemistry
                    33         1088   Human Mutation
                    34         1020   Molecular Biology of the Cell
                    35         1012   Journal of Immunology
                    36          987   Genetics
                    37          889   Structure
                    38          882   Infection and Immunity
                    39          872   Journal of General Virology
                    40          871   The Plant Cell
                    41          825   Molecular Cell
                    42          825   Archives of Biochemistry and Biophysics
                    43          799   Blood
                    44          757   Yeast
                    45          754   Microbiology
                    46          752   The Plant Journal
                    47          735   Journal of Cell Science
                    48          728   Developmental Biology
                    49          676   Cancer Research
                    50          656   FEMS Microbiology Letters
                    51          651   Current Biology
                    52          595   Mechanisms of Development
                    53          592   Nature Structural Biology
                    54          590   Human Genetics
                    55          556   Acta Crystallographica, Section D
                    56          551   Protein Science
                    57          534   Applied and Environmental Microbiology
                    58          534   Journal of Neuroscience
                    59          524   Current Genetics
                    60          515   Toxicon
                    61          506   Neuron
                    62          503   Journal of Clinical Investigation
                    63          470   Mammalian Genome
                    64          461   American Journal of Physiology
                    65          449   Immunogenetics
                    66          447   The Journal of Experimental Medicine
                    67          441   Molecular Endocrinology
                    68          419   Molecular and Biochemical Parasitology
                    69          411   Journal of Neurochemistry
                    70          410   The Journal of Clinical Endocrinology and Metabolism
                    71          387   Endocrinology
                    72          378   Journal of Molecular Evolution
                    73          372   Proteins
                    74          367   Bioscience, Biotechnology, and Biochemistry
                    75          366   DNA and Cell Biology
                    76          357   Molecular Biology and Evolution
                    77          356   DNA Sequence
                    78          350   Journal of Medical Genetics
                    79          320   Tissue Antigens
                    80          314   Brain Research. Molecular Brain Research
                    81          306   Plant and Cell Physiology
                    82          304   Nature Cell Biology
                    83          298   Experimental Cell Research
                    84          297   Peptides
                    85          297   Comparative Biochemistry and Physiology
                    86          289   Biological Chemistry Hoppe-Seyler
                    87          282   Antimicrobial Agents and Chemotherapy
                    88          276   Journal of Investigative Dermatology
                    89          275   Cytogenetics and Cell Genetics
                    90          268   Molecular Pharmacology
                    91          258   Biology of Reproduction
                    92          248   Journal of General Microbiology
                    93          247   Genome Research
                    94          247   Developmental Cell
                    95          242   Neurology
                    96          241   Developmental Dynamics
                    97          239   RNA
                    98          232   Virus Research
                    99          215   Hoppe-Seyler's Zeitschrift für Physiologische Chemie
                    100          211   Planta
                    101          206   Molecular Plant-Microbe Interactions
                    102          205   DNA Research
                    103          204   European Journal of Immunology
                    104          203   Biochimie
                    105          202   Annals of Neurology
                    106          198   Genes to Cells
                    107          194   European Journal of Human Genetics
                    108          191   Eukaryotic Cell
                    109          190   Immunity
                    110          187   The New England Journal of Medicine
                    111          185   Journal of Human Genetics
                    112          179   Nature Structural and Molecular Biology
                    113          175   Molecular and Cellular Endocrinology
                    114          172   The FEBS Journal
                    115          169   Investigative Ophthalmology and Visual Science
                    116          165   Archives of Microbiology
                    117          165   The FASEB Journal
                    118          163   American Journal of Medical Genetics
                    119          163   Molecular Phylogenetics and Evolution
                    120          161   Insect Biochemistry and Molecular Biology
                    121          161   EMBO Reports
                    122          159   DNA
                    123          153   Molecular Immunology
                    124          153   Hemoglobin
                    125          152   Bioorganicheskaia Khimiia
                    126          151   Molecular Reproduction and Development
                    127          150   Diabetes
                    128          147   Archives of Virology
                    129          145   Glycobiology
                    130          144   Clinical Genetics
                    131          138   International Journal of Cancer
                    132          137   Molecular Genetics and Metabolism
                    133          136   General and Comparative Endocrinology
                    134          136   Animal Genetics
                    135          135   Molecular and Cellular Neuroscience
                    136          133   Journal of Cellular Biochemistry
                    137          132   Journal of the American Chemical Society
                    138          130   British Journal of Haematology
                    139          129   Biological Chemistry
                    140          126   BMC Genomics
                    141          126   American Journal of Medical Genetics. Part A
                    142          125   Molecular Genetics and Genomics
                    143          125   Nature Immunology
                    144          122   Journal of Lipid Research
                    145          122   Agricultural and Biological Chemistry
                    146          118   Circulation Research
                    147          116   Proteomics
                    148          115   Neuroscience Letters
                    149          114   Thrombosis and Haemostasis
                    150          114   Journal of Medicinal Chemistry
                    
                    
                    5.  STATISTICS FOR SOME LINE TYPES
                    
                    The following table summarizes the total number of some UniProtKB/Swiss-Prot lines,
                    as well as the number of entries with at least one such line, and the
                    frequency of the lines.
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank
                    ------------------------------------  -------- ---------  ---------  ----
                    
                    References (RL)                       928298                 1.79                                         
                    Journal                            735916     389078      1.42       1                                 
                    Submitted to EMBL/GenBank/DDBJ     179701     166254      0.35       2                                 
                    Submitted to other databases        10610       9198      0.02       3                                 
                    Book citation                         639        625     <0.01       4                                 
                    Plant Gene Register                   560        548     <0.01       5                                 
                    Thesis                                399        396     <0.01       6                                 
                    Unpublished observations              297        293     <0.01       7                                 
                    Patent                                170        168     <0.01       8                                 
                    Worm Breeder's Gazette                  6          6     <0.01       9                                 
                    
                    Total number of distinct authors cited in UniProtKB/Swiss-Prot: 290543
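
The "Average per entry" column in these tables appears to be the total number of lines divided by the number of entries in the release (518415, per the introduction), with averages that would print as 0.00 shown as "<0.01". The Python sketch below reproduces a few rows under that assumption. The function name and the rounding rule are illustrative guesses, not something the release notes state.

```python
# Number of sequence entries in release 2010_08 (from the introduction).
TOTAL_ENTRIES = 518415


def average_per_entry(total_lines: int, n_entries: int = TOTAL_ENTRIES) -> str:
    """Format the average number of lines per entry the way the tables show it.

    Assumption: averages that would print as 0.00 are shown as "<0.01".
    """
    text = f"{total_lines / n_entries:.2f}"
    return "<0.01" if text == "0.00" else text


# Totals copied from the References (RL) table.
for label, total in [
    ("References (RL)", 928298),
    ("Journal", 735916),
    ("Book citation", 639),
]:
    print(f"{label:<20} {average_per_entry(total)}")
```

Note that the average is taken over all entries in the release, not only over the entries that carry at least one such line.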
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank
                    ------------------------------------  -------- ---------  ---------  ----
                    Comments (CC)                        2228234                 4.30                                         
                    ALLERGEN                              465        465     <0.01      26                                 
                    ALTERNATIVE PRODUCTS                18906      18906      0.04      13                                 
                    BIOPHYSICOCHEMICAL PROPERTIES        3147       3147      0.01      22                                 
                    BIOTECHNOLOGY                         275        273     <0.01      28                                 
                    CATALYTIC ACTIVITY                 224889     205307      0.43       4                                 
                    CAUTION                              7067       6925      0.01      19                                 
                    COFACTOR                           100065      91843      0.19       7                                 
                    DEVELOPMENTAL STAGE                  8909       8909      0.02      16                                 
                    DISEASE                              4276       2892      0.01      21                                 
                    DISRUPTION PHENOTYPE                 2776       2776      0.01      23                                 
                    DOMAIN                              31986      28288      0.06      11                                 
                    ENZYME REGULATION                    8901       8901      0.02      17                                 
                    FUNCTION                           387606     371523      0.75       2                                 
                    INDUCTION                           11923      11923      0.02      15                                 
                    INTERACTION                         12576      12576      0.02      14                                 
                    MASS SPECTROMETRY                    4431       3353      0.01      20                                 
                    MISCELLANEOUS                       30349      28073      0.06      12                                 
                    PATHWAY                            127008     116121      0.24       6                                 
                    PHARMACEUTICAL                         84         84     <0.01      29                                 
                    POLYMORPHISM                          791        756     <0.01      24                                 
                    PTM                                 35866      29025      0.07       9                                 
                    RNA EDITING                           611        611     <0.01      25                                 
                    SEQUENCE CAUTION                    37929      37929      0.07       8                                 
                    SIMILARITY                         603793     493758      1.16       1                                 
                    SUBCELLULAR LOCATION               299758     294677      0.58       3                                 
                    SUBUNIT                            221852     221852      0.43       5                                 
                    TISSUE SPECIFICITY                  33268      33268      0.06      10                                 
                    TOXIC DOSE                            426        415     <0.01      27                                 
                    WEB RESOURCE                         8301       6583      0.02      18                                 
                    
                    Total number of comment topics: 29
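
The Rank column is consistent with ordering the subtypes by their total number of lines, largest first. A minimal Python sketch of that ordering, using the five most frequent comment topics from the table above, follows. The helper name `rank_by_total` is hypothetical, and how ties are broken is not stated in the source, so the sketch simply keeps input order for equal totals.

```python
def rank_by_total(totals: dict[str, int]) -> dict[str, int]:
    """Return a 1-based rank for each key, largest total first.

    Equal totals keep their input order (an assumption; the source does
    not say how ties are resolved).
    """
    ordered = sorted(totals, key=totals.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


# Totals copied from the Comments (CC) table, in table (alphabetical) order.
comment_totals = {
    "CATALYTIC ACTIVITY": 224889,
    "FUNCTION": 387606,
    "SIMILARITY": 603793,
    "SUBCELLULAR LOCATION": 299758,
    "SUBUNIT": 221852,
}

for name, rank in rank_by_total(comment_totals).items():
    print(f"{rank:>2}  {name}")
```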
                    
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank
                    ------------------------------------  -------- ---------  ---------  ----
                    Features (FT)                        3247931                 6.27                                         
                    ACT_SITE                           129656      77502      0.25       9                                 
                    BINDING                            206089      58052      0.40       4                                 
                    CA_BIND                              3668       1491      0.01      35                                 
                    CARBOHYD                           100409      25469      0.19      13                                 
                    CHAIN                              524820     513530      1.01       1                                 
                    COILED                              18295      12389      0.04      26                                 
                    COMPBIAS                            49712      25923      0.10      18                                 
                    CONFLICT                           117011      41056      0.23      10                                 
                    CROSSLNK                             4944       3178      0.01      34                                 
                    DISULFID                            96416      25776      0.19      14                                 
                    DNA_BIND                            10944      10076      0.02      29                                 
                    DOMAIN                             144577      86186      0.28       6                                 
                    HELIX                              130424      13641      0.25       8                                 
                    INIT_MET                            14833      14833      0.03      27                                 
                    INTRAMEM                             1523        720     <0.01      37                                 
                    LIPID                               10573       6735      0.02      31                                 
                    METAL                              276005      68087      0.53       3                                 
                    MOD_RES                            180341      59700      0.35       5                                 
                    MOTIF                               32228      20774      0.06      22                                 
                    MUTAGEN                             31367       7466      0.06      23                                 
                    NON_CONS                             1826        687     <0.01      36                                 
                    NON_STD                               349        274     <0.01      39                                 
                    NON_TER                             11797       8974      0.02      28                                 
                    NP_BIND                            104522      68055      0.20      12                                 
                    PEPTIDE                              8693       5603      0.02      32                                 
                    PROPEP                              10573       8923      0.02      30                                 
                    REGION                              92790      50893      0.18      15                                 
                    REPEAT                              88842      13125      0.17      16                                 
                    SIGNAL                              34667      34657      0.07      21                                 
                    SITE                                37323      22145      0.07      20                                 
                    STRAND                             130956      12754      0.25       7                                 
                    TOPO_DOM                           116689      23932      0.23      11                                 
                    TRANSIT                              6578       6492      0.01      33                                 
                    TRANSMEM                           338874      69223      0.65       2                                 
                    TURN                                31113      10773      0.06      24                                 
                    UNSURE                               1201        395     <0.01      38                                 
                    VAR_SEQ                             39211      16833      0.08      19                                 
                    VARIANT                             79751      16503      0.15      17                                 
                    ZN_FING                             28341      12326      0.05      25                                 
                    
                    Total number of feature keys: 39
                    
                    
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank      Category
                    ------------------------------------  -------- ---------  ---------  ----      -------------------------------------------
                    Cross-references (DR)               13275589                25.61                                                           
                    2DBase-Ecoli                           85         85     <0.01     118      2D gel databases                             
                    Aarhus/Ghent-2DPAGE                   126         96     <0.01     115      2D gel databases                             
                    AGD                                   827        821     <0.01      94      Organism-specific databases                  
                    ANU-2DPAGE                             23         23     <0.01     124      2D gel databases                             
                    ArachnoServer                         460        456     <0.01     102      Organism-specific databases                  
                    ArrayExpress                        58185      58185      0.11      39      Gene expression databases                    
                    Bgee                                39443      39441      0.08      45      Gene expression databases                    
                    BindingDB                             297        297     <0.01     110      Other                                        
                    BioCyc                             160692     147808      0.31      21      Enzyme and pathway databases                 
                    BRENDA                              65183      62384      0.13      36      Enzyme and pathway databases                 
                    CAZy                                 7172       6435      0.01      68      Protein family/group databases               
                    CGD                                   561        555     <0.01      99      Organism-specific databases                  
                    CleanEx                             30187      29542      0.06      47      Gene expression databases                    
                    COMPLUYEAST-2DPAGE                    101        100     <0.01     117      2D gel databases                             
                    ConoServer                            613        587     <0.01      98      Organism-specific databases                  
                    Cornea-2DPAGE                          67         67     <0.01     119      2D gel databases                             
                    CTD                                 64784      64232      0.12      37      Organism-specific databases                  
                    CYGD                                 6629       6540      0.01      71      Organism-specific databases                  
                    dictyBase                            4372       4244      0.01      80      Organism-specific databases                  
                    DIP                                 11486      11382      0.02      62      Protein-protein interaction databases        
                    DisProt                               397        394     <0.01     105      3D structure databases                       
                    DOSAC-COBS-2DPAGE                     149        147     <0.01     114      2D gel databases                             
                    DrugBank                             5317       1626      0.01      73      Other                                        
                    EchoBASE                             4167       4163      0.01      82      Organism-specific databases                  
                    ECO2DBASE                             352        300     <0.01     109      2D gel databases                             
                    EcoGene                              4396       4394      0.01      78      Organism-specific databases                  
                    eggNOG                             217074     217074      0.42      18      Phylogenomic databases                       
                    EMBL                               856728     508471      1.65       3      Sequence databases                           
                    Ensembl                             74434      57636      0.14      31      Genome annotation databases                  
                    EnsemblBacteria                     97262      84186      0.19      27      Genome annotation databases                  
                    EnsemblFungi                        14449      14295      0.03      58      Genome annotation databases                  
                    EnsemblMetazoa                      12533       8313      0.02      60      Genome annotation databases                  
                    EnsemblPlants                       12780      11345      0.02      59      Genome annotation databases                  
                    EnsemblProtists                      4282       4160      0.01      81      Genome annotation databases                  
                    euHCVdb                                55         44     <0.01     120      Organism-specific databases                  
                    EuPathDB                              233        233     <0.01     113      Organism-specific databases                  
                    FlyBase                              5647       5271      0.01      72      Organism-specific databases                  
                    Gene3D                             236550     194456      0.46      17      Family and domain databases                  
                    GeneCards                           20589      19824      0.04      51      Organism-specific databases                  
                    GeneDB_Spombe                        4977       4932      0.01      75      Organism-specific databases                  
                    GeneFarm                             2697       2682      0.01      86      Organism-specific databases                  
                    GeneID                             476676     449760      0.92       6      Genome annotation databases                  
                    Genevestigator                      64649      64649      0.12      38      Gene expression databases                    
                    GenoList                             7039       7027      0.01      69      Organism-specific databases                  
                    GenomeReviews                      377388     357322      0.73       9      Genome annotation databases                  
                    GermOnline                          41903      41309      0.08      44      Gene expression databases                    
                    GlycoSuiteDB                          280        280     <0.01     111      PTM databases                                
                    GO                                2162252     484492      4.17       1      Ontologies                                   
                    Gramene                              4385       4385      0.01      79      Organism-specific databases                  
                    H-InvDB                             11797      11057      0.02      61      Organism-specific databases                  
                    HAMAP                              307434     307290      0.59      14      Family and domain databases                  
                    HGNC                                19710      19529      0.04      53      Organism-specific databases                  
                    HOGENOM                            360232     360232      0.69      11      Phylogenomic databases                       
                    HOVERGEN                            74431      74431      0.14      32      Phylogenomic databases                       
                    HPA                                 11304       8342      0.02      63      Organism-specific databases                  
                    HSSP                                29099      29099      0.06      48      3D structure databases                       
                    InParanoid                          66267      66267      0.13      35      Phylogenomic databases                       
                    IntAct                              22181      22180      0.04      50      Protein-protein interaction databases        
                    InterPro                          1625443     493263      3.14       2      Family and domain databases                  
                    IPI                                 89002      63795      0.17      29      Sequence databases                           
                    KEGG                               440485     418771      0.85       8      Genome annotation databases                  
                    LegioList                             760        758     <0.01      95      Organism-specific databases                  
                    Leproma                               665        662     <0.01      97      Organism-specific databases                  
                    MaizeGDB                              472        467     <0.01     101      Organism-specific databases                  
                    MEROPS                               9931       9619      0.02      64      Protein family/group databases               
                    MGI                                 16187      16138      0.03      56      Organism-specific databases                  
                    MIM                                 16334      12826      0.03      55      Organism-specific databases                  
                    MINT                                17490      17490      0.03      54      Protein-protein interaction databases        
                    NextBio                             48795      48794      0.09      42      Other                                        
                    NMPDR                              130411     130407      0.25      24      Genome annotation databases                  
                    OGP                                   377        377     <0.01     107      2D gel databases                             
                    OMA                                368433     368433      0.71      10      Phylogenomic databases                       
                    Orphanet                             3810       2185      0.01      84      Organism-specific databases                  
                    OrthoDB                             56359      56359      0.11      40      Phylogenomic databases                       
                    PANTHER                            185411     170187      0.36      20      Family and domain databases                  
                    Pathway_Interaction_DB               4567       1665      0.01      76      Enzyme and pathway databases                 
                    PDB                                 68513      15886      0.13      33      3D structure databases                       
                    PDBsum                              68511      15885      0.13      34      3D structure databases                       
                    PeptideAtlas                         5169       5169      0.01      74      Proteomic databases                          
                    PeroxiBase                            678        666     <0.01      96      Protein family/group databases               
                    Pfam                               684756     480642      1.32       4      Family and domain databases                  
                    PharmGKB                            15791      15780      0.03      57      Organism-specific databases                  
                    PHCI-2DPAGE                           247        247     <0.01     112      2D gel databases                             
                    PhosphoSite                         20425      20425      0.04      52      PTM databases                                
                    PhosSite                              352        352     <0.01     108      PTM databases                                
                    PhylomeDB                          121668     121668      0.23      25      Phylogenomic databases                       
                    PIR                                115415     105426      0.22      26      Sequence databases                           
                    PIRSF                               84174      84174      0.16      30      Family and domain databases                  
                    PMAP-CutDB                           1395       1395     <0.01      89      Other                                        
                    PMMA-2DPAGE                            52         52     <0.01     121      2D gel databases                             
                    PptaseDB                               34         34     <0.01     122      Protein family/group databases               
                    PRIDE                               53898      53898      0.10      41      Proteomic databases                          
                    PRINTS                             135570     117364      0.26      23      Family and domain databases                  
                    ProDom                              27827      27498      0.05      49      Family and domain databases                  
                    ProMEX                                447        447     <0.01     103      Proteomic databases                          
                    PROSITE                            459849     292704      0.89       7      Family and domain databases                  
                    ProtClustDB                        324375     324375      0.63      13      Phylogenomic databases                       
                    PseudoCAP                            1221       1212     <0.01      91      Organism-specific databases                  
                    Rat-heart-2DPAGE                       28         28     <0.01     123      2D gel databases                             
                    Reactome                             7806       4424      0.02      66      Enzyme and pathway databases                 
                    REBASE                                378        358     <0.01     106      Protein family/group databases               
                    RefSeq                             498693     450275      0.96       5      Sequence databases                           
                    REPRODUCTION-2DPAGE                  1252       1031     <0.01      90      2D gel databases                             
                    RGD                                  7416       7412      0.01      67      Organism-specific databases                  
                    SGD                                  6641       6556      0.01      70      Organism-specific databases                  
                    Siena-2DPAGE                          103        103     <0.01     116      2D gel databases                             
                    SMART                              142252     109737      0.27      22      Family and domain databases                  
                    SMR                                346901     346901      0.67      12      3D structure databases                       
                    STRING                             203856     203813      0.39      19      Protein-protein interaction databases        
                    SUPFAM                             303394     242497      0.59      15      Family and domain databases                  
                    SWISS-2DPAGE                         1183       1181     <0.01      92      2D gel databases                             
                    TAIR                                 9242       9131      0.02      65      Organism-specific databases                  
                    TCDB                                 3365       3324      0.01      85      Protein family/group databases               
                    TIGR                                34039      33272      0.07      46      Genome annotation databases                  
                    TIGRFAMs                           282974     263200      0.55      16      Family and domain databases                  
                    TubercuList                          1690       1654     <0.01      88      Organism-specific databases                  
                    UCD-2DPAGE                            512        502     <0.01     100      2D gel databases                             
                    UCSC                                48546      39568      0.09      43      Genome annotation databases                  
                    UniGene                             94075      82817      0.18      28      Sequence databases                           
                    VectorBase                            431        417     <0.01     104      Genome annotation databases                  
                    World-2DPAGE                          915        904     <0.01      93      2D gel databases                             
                    WormBase                             4521       3736      0.01      77      Organism-specific databases                  
                    Xenbase                              4107       4034      0.01      83      Organism-specific databases                  
                    ZFIN                                 2582       2571     <0.01      87      Organism-specific databases                  
                    
                    Total number of cross-referenced databases: 124
                    
                    6.  AMINO ACID COMPOSITION
                    
                    6.1  Composition in percent for the complete database
                    
                    Ala (A) 8.27   Gln (Q) 3.94   Leu (L) 9.67   Ser (S) 6.51
                    Arg (R) 5.53   Glu (E) 6.76   Lys (K) 5.85   Thr (T) 5.33
                    Asn (N) 4.05   Gly (G) 7.09   Met (M) 2.42   Trp (W) 1.08
                    Asp (D) 5.45   His (H) 2.27   Phe (F) 3.86   Tyr (Y) 2.92
                    Cys (C) 1.36   Ile (I) 5.99   Pro (P) 4.69   Val (V) 6.87
                    
                    Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.00
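
A percentage composition such as the one in 6.1 can be reproduced in principle by counting residues over all sequences and dividing by the total residue count. The following is a minimal sketch, not the procedure used to build this release: the function name `composition_percent` and the two toy sequences are hypothetical, and one-letter residue codes are assumed.

```python
from collections import Counter


def composition_percent(sequences):
    """Return {one-letter code: percent of all residues} over the given sequences."""
    counts = Counter()
    for seq in sequences:
        # Counter.update on a string counts each character (residue) once.
        counts.update(seq.upper())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {aa: 100.0 * n / total for aa, n in counts.items()}


# Toy input for illustration only, not real Swiss-Prot data.
toy_sequences = ["MKLA", "AAGL"]
comp = composition_percent(toy_sequences)
```

With the toy input, Ala accounts for 3 of 8 residues, so `comp["A"]` is 37.5.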
                    
                    
                    
                    Legend for the composition chart: gray = aliphatic, red = acidic,
                    green = small hydroxy, blue = basic, black = aromatic, white = amide,
                    yellow = sulfur
                    
                    
                    6.2  Classification of the amino acids by their frequency
                    
                    Leu, Ala, Gly, Val, Glu, Ser, Ile, Lys, Arg, Asp, Thr, Pro, Asn, Gln,
                    Phe, Tyr, Met, His, Cys, Trp
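
The ordering in 6.2 follows directly from sorting the percentages in 6.1 in descending order. A short sketch that reproduces it, using the values from the table above (the variable names are illustrative):

```python
# Percent composition copied from section 6.1.
composition = {
    "Ala": 8.27, "Gln": 3.94, "Leu": 9.67, "Ser": 6.51,
    "Arg": 5.53, "Glu": 6.76, "Lys": 5.85, "Thr": 5.33,
    "Asn": 4.05, "Gly": 7.09, "Met": 2.42, "Trp": 1.08,
    "Asp": 5.45, "His": 2.27, "Phe": 3.86, "Tyr": 2.92,
    "Cys": 1.36, "Ile": 5.99, "Pro": 4.69, "Val": 6.87,
}

# Rank residues from most to least frequent.
ranking = sorted(composition, key=composition.get, reverse=True)

# Sum of the rounded percentages; slightly below 100 because each
# value is rounded to two decimals in the table.
total_percent = round(sum(composition.values()), 2)

print(", ".join(ranking))
```

The rounded percentages add up to 99.91 rather than exactly 100.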
                    
                    
                    7.  MISCELLANEOUS STATISTICS
                    
                    4447 entries are encoded on a mitochondrion, and 3573 are encoded on a plasmid.
                    
                    12175 entries are encoded on a plastid, 
                    of which 21 are encoded on apicoplasts, 
                    11617 on chloroplasts, 
                    44 on organellar chromatophores,
                    145 on cyanelles, 
                    149 on non-photosynthetic plastids and 
                    199 on unspecified types of plastid.
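
As a consistency check, the plastid subcategories listed above should add up to the plastid total. A minimal sketch using the figures from this section (the dictionary keys are illustrative labels, not UniProtKB vocabulary):

```python
# Figures copied from section 7.
plastid_total = 12175
plastid_breakdown = {
    "apicoplast": 21,
    "chloroplast": 11617,
    "organellar chromatophore": 44,
    "cyanelle": 145,
    "non-photosynthetic plastid": 149,
    "unspecified plastid": 199,
}

breakdown_sum = sum(plastid_breakdown.values())

# The subcategories are expected to partition the plastid-encoded entries.
is_consistent = breakdown_sum == plastid_total
print(breakdown_sum, is_consistent)
```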
                    
                    Number of entries with at least one sequence correction: 69189