                    UniProtKB/Swiss-Prot protein knowledgebase release 2011_08 statistics
                    
                    
                    1.  INTRODUCTION
                    
                    Release 2011_08 of 27-Jul-11 of UniProtKB/Swiss-Prot contains 531473 sequence entries,
                    comprising 188463640 amino acids abstracted from 200346 references. 
                    
                    1647 sequences have been added since release 2011_07, the sequence data of
                    1718 existing entries has been updated and the annotations of
                    200839 entries have been revised.
                    
                    Number of fragments: 8900
                    Number of additional sequences produced by alternative splicing, initiation or promoter usage, or ribosomal frameshifting: 30558
                    
                    
                    Protein existence (PE):           entries     %
                    
                    1: Evidence at protein level        73667   13.9%
                    2: Evidence at transcript level     68884   13.0%
                    3: Inferred from homology          372619   70.1%
                    4: Predicted                        14435    2.7%
                    5: Uncertain                         1868    0.4%
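
The percentage column can be reproduced from the entry counts alone. The following is a minimal Python sketch that uses only the figures quoted in the table above; the variable names are illustrative and not part of any UniProt tool.

```python
# Entry counts per protein existence (PE) level, copied from the table above.
pe_counts = {
    "1: Evidence at protein level": 73667,
    "2: Evidence at transcript level": 68884,
    "3: Inferred from homology": 372619,
    "4: Predicted": 14435,
    "5: Uncertain": 1868,
}

# The five levels should add up to the total number of entries in the release.
total_entries = sum(pe_counts.values())

# Share of each PE level, rounded to one decimal place.
pe_percent = {
    level: round(100 * count / total_entries, 1)
    for level, count in pe_counts.items()
}

for level, count in pe_counts.items():
    print(f"{level:<34}{count:>7}  {pe_percent[level]:>5.1f}%")
```

Summing the counts first also serves as a consistency check against the total quoted in the introduction.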
                    
                    The growth of the database is summarized below.
                    
                    
                    
                    
                    2.  TAXONOMIC ORIGIN
                    
                    Total number of species represented in this release of UniProtKB/Swiss-Prot: 12564
                    
                    The first twenty species represent 110511 sequences:  20.8 % of the total
                    number of entries.
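
The figure for the first twenty species can be checked against the counts listed for ranks 1 to 20 in table 2.2. A short Python sketch, with the counts copied by hand from that table:

```python
# Entry counts of the twenty most represented species (ranks 1-20 in table 2.2).
top20 = [
    20244, 16384, 10617, 7631, 6620, 5857, 4976, 4430, 4244, 4129,
    3334, 3319, 3124, 2771, 2743, 2214, 2212, 2000, 1875, 1787,
]
total_entries = 531473  # total entries in release 2011_08

top20_sum = sum(top20)
top20_share = 100 * top20_sum / total_entries  # percentage of all entries

print(top20_sum, f"{top20_share:.1f}%")
```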
                    
                    
                    2.1 Table of the frequency of occurrence of species
                    
                    Species represented      1x: 5313
                                             2x: 1828
                                             3x:  950
                                             4x:  619
                                             5x:  453
                                             6x:  361
                                             7x:  258
                                             8x:  217
                                             9x:  197
                                            10x:  109
                                        11- 20x:  635
                                        21- 50x:  388
                                        51-100x:  201
                                          >100x: 1035
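
A table of this kind can be built by assigning each species to a frequency class according to its number of entries. The sketch below is one possible way to do this in Python; the function names and the toy input are hypothetical, and only the class boundaries are taken from the table above.

```python
from collections import Counter

# Range classes used by the table; counts from 1 to 10 each have their own class.
RANGES = [(11, 20, "11- 20x"), (21, 50, "21- 50x"), (51, 100, "51-100x")]


def frequency_class(n):
    """Return the class label for a species represented by n entries."""
    if n < 1:
        raise ValueError("a represented species has at least one entry")
    if n <= 10:
        return f"{n}x"
    for low, high, label in RANGES:
        if low <= n <= high:
            return label
    return ">100x"


def frequency_table(entries_per_species):
    """Count how many species fall into each frequency class."""
    return Counter(frequency_class(n) for n in entries_per_species.values())


# Hypothetical toy input, not real release data.
toy = {"species A": 1, "species B": 1, "species C": 7,
       "species D": 42, "species E": 20244}
table = frequency_table(toy)

# Consistency check on the published classes: they should sum to the
# total number of species represented in the release.
published = [5313, 1828, 950, 619, 453, 361, 258, 217, 197, 109,
             635, 388, 201, 1035]
published_total = sum(published)
```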
                    
                    
                    2.2  Table of the most represented species
                    
                    ------  ---------  --------------------------------------------
                    Number  Frequency  Species
                    ------  ---------  --------------------------------------------
                    1      20244  Homo sapiens (Human)
                    2      16384  Mus musculus (Mouse)
                    3      10617  Arabidopsis thaliana (Mouse-ear cress)
                    4       7631  Rattus norvegicus (Rat)
                    5       6620  Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast)
                    6       5857  Bos taurus (Bovine)
                    7       4976  Schizosaccharomyces pombe (strain ATCC 38366 / 972) (Fission yeast)
                    8       4430  Escherichia coli (strain K12)
                    9       4244  Bacillus subtilis
                    10       4129  Dictyostelium discoideum (Slime mold)
                    11       3334  Caenorhabditis elegans
                    12       3319  Xenopus laevis (African clawed frog)
                    13       3124  Drosophila melanogaster (Fruit fly)
                    14       2771  Oryza sativa subsp. japonica (Rice)
                    15       2743  Danio rerio (Zebrafish) (Brachydanio rerio)
                    16       2214  Pongo abelii (Sumatran orangutan)
                    17       2212  Gallus gallus (Chicken)
                    18       2000  Escherichia coli O157:H7
                    19       1875  Mycobacterium tuberculosis
                    20       1787  Methanocaldococcus jannaschii  
                    21       1784  Salmonella typhimurium
                    22       1707  Haemophilus influenzae (strain ATCC 51907 / DSM 11121 / KW20 / Rd)
                    23       1677  Shigella flexneri
                    24       1674  Escherichia coli O6
                    25       1608  Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
                    26       1396  Sus scrofa (Pig)
                    27       1344  Salmonella typhi
                    28       1285  Pseudomonas aeruginosa
                    29       1243  Mycobacterium bovis
                    30       1167  Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey)
                    31       1025  Synechocystis sp. (strain ATCC 27184 / PCC 6803 / N-1)
                    32       1001  Yersinia pestis
                    33        999  Archaeoglobus fulgidus 
                    34        949  Vibrio cholerae
                    35        929  Salmonella paratyphi A
                    36        924  Staphylococcus aureus (strain N315)
                    37        923  Staphylococcus aureus (strain Mu50 / ATCC 700699)
                    38        920  Ashbya gossypii (strain ATCC 10895 / CBS 109.51 / FGSC 9923 / NRRL Y-1056)  
                    39        913  Rhizobium meliloti (Ensifer meliloti) (Sinorhizobium meliloti)
                    40        909  Acanthamoeba polyphaga mimivirus (APMV)
                    41        899  Kluyveromyces lactis   
                    42        898  Staphylococcus aureus (strain COL)
                    43        895  Staphylococcus aureus (strain MW2)
                    44        889  Staphylococcus aureus (strain MSSA476)
                    45        887  Staphylococcus aureus (strain MRSA252)
                    46        886  Escherichia coli O6:K15:H31 (strain 536 / UPEC)
                    47        884  Oryctolagus cuniculus (Rabbit)
                    48        881  Salmonella choleraesuis
                    49        877  Shigella sonnei (strain Ss046)
                    50        864  Yersinia pseudotuberculosis
                    51        858  Candida glabrata   
                    52        848  Candida albicans (Yeast)
                    53        841  Escherichia coli O9:H4 (strain HS)
                    54        834  Escherichia coli O139:H28 (strain E24377A / ETEC)
                    55        826  Shigella boydii serotype 4 (strain Sb227)
                    56        823  Escherichia coli (strain UTI89 / UPEC)
                    57        819  Escherichia coli (strain ATCC 8739 / DSM 1576 / Crooks)
                    58        809  Neurospora crassa 
                    59        809  Shigella dysenteriae serotype 1 (strain Sd197)
                    60        795  Vibrio parahaemolyticus
                    61        791  Canis familiaris (Dog) (Canis lupus familiaris)
                    62        790  Escherichia coli (strain SMS-3-5 / SECEC)
                    63        779  Erwinia carotovora subsp. atroseptica (Pectobacterium atrosepticum)
                    64        774  Aquifex aeolicus
                    65        772  Pasteurella multocida (strain Pm70)
                    66        770  Escherichia coli (strain K12 / DH10B)
                    67        764  Escherichia coli O127:H6 (strain E2348/69 / EPEC)
                    68        764  Escherichia coli (strain K12 / MC4100 / BW2952)
                    69        762  Escherichia coli O17:K52:H18 (strain UMN026 / ExPEC)
                    70        762  Escherichia coli (strain 55989 / EAEC)
                    71        761  Escherichia coli O8 (strain IAI1)
                    72        759  Staphylococcus epidermidis (strain ATCC 35984 / RP62A)
                    73        758  Emericella nidulans (Aspergillus nidulans)
                    74        758  Shigella flexneri serotype 5b (strain 8401)
                    75        756  Escherichia coli (strain SE11)
                    76        756  Staphylococcus epidermidis (strain ATCC 12228)
                    77        756  Escherichia coli O45:K1 (strain S88 / ExPEC)
                    78        755  Streptomyces coelicolor
                    79        753  Escherichia coli O7:K1 (strain IAI39 / ExPEC)
                    80        746  Escherichia coli O157:H7 (strain EC4115 / EHEC)
                    81        742  Photorhabdus luminescens subsp. laumondii (strain TT01)
                    82        733  Bacillus halodurans
                    83        733  Yersinia enterocolitica serotype O:8 / biotype 1B (strain 8081)
                    84        731  Escherichia coli O81 (strain ED1a)
                    85        731  Vibrio vulnificus
                    86        726  Bacillus anthracis
                    87        720  Salmonella enteritidis PT4 (strain P125109)
                    88        717  Staphylococcus aureus (strain NCTC 8325)
                    89        716  Vibrio vulnificus (strain YJ016)
                    90        715  Salmonella paratyphi B (strain ATCC BAA-1250 / SPB7)
                    91        713  Yersinia pestis bv. Antiqua (strain Nepal516)
                    92        713  Salmonella paratyphi A (strain AKU_12601)
                    93        713  Yersinia pseudotuberculosis serotype O:1b (strain IP 31758)
                    94        712  Salmonella agona (strain SL483)
                    95        712  Salmonella newport (strain SL254)
                    96        711  Escherichia coli O1:K1 / APEC
                    97        710  Salmonella heidelberg (strain SL476)
                    98        709  Yersinia pestis bv. Antiqua (strain Antiqua)
                    99        709  Salmonella schwarzengrund (strain CVM19633)
                    100        708  Enterobacter sp. (strain 638)
                    101        706  Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)
                    102        701  Salmonella dublin (strain CT_02021853)
                    103        698  Shigella boydii serotype 18 (strain CDC 3083-94 / BS512)
                    104        693  Klebsiella pneumoniae (strain 342)
                    105        691  Escherichia fergusonii (strain ATCC 35469 / DSM 13698 / CDC 0568-73)
                    106        687  Mycoplasma pneumoniae
                    107        686  Pan troglodytes (Chimpanzee)
                    108        686  Nostoc sp. (strain PCC 7120 / UTEX 2576)
                    109        684  Pseudomonas syringae pv. tomato
                    110        683  Salmonella gallinarum (strain 287/91 / NCTC 13346)
                    111        680  Zea mays (Maize)
                    112        674  Pseudomonas putida (strain KT2440)
                    113        673  Citrobacter koseri (strain ATCC BAA-895 / CDC 4225-83 / SGSC4696)
                    114        668  Mycobacterium leprae
                    115        666  Staphylococcus aureus (strain USA300)
                    116        666  Yersinia pestis (strain Pestoides F)
                    117        664  Serratia proteamaculans (strain 568)
                    118        658  Rhizobium sp. (strain NGR234)
                    119        650  Bradyrhizobium japonicum
                    120        648  Debaryomyces hansenii   
                    121        642  Bacillus cereus (strain ATCC 14579 / DSM 31)
                    122        642  Staphylococcus aureus (strain bovine RF122 / ET3-1)
                    123        640  Escherichia coli
                    124        637  Salmonella arizonae (strain ATCC BAA-731 / CDC346-86 / RSK2980)
                    125        637  Yersinia pseudotuberculosis serotype O:3 (strain YPIII)
                    126        633  Yersinia pseudotuberculosis serotype IB (strain PB1/+)
                    127        631  Yarrowia lipolytica (strain CLIB 122 / E 150) (Yeast) (Candida lipolytica)
                    128        626  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
                    129        621  Shewanella oneidensis
                    130        619  Neosartorya fumigata (strain ATCC MYA-4609 / Af293 / CBS 101355 / FGSC A1100) 
                    131        615  Treponema pallidum (strain Nichols)
                    132        615  Enterobacter sakazakii (strain ATCC BAA-894)
                    133        614  Ralstonia solanacearum (Pseudomonas solanacearum)
                    134        611  Staphylococcus haemolyticus (strain JCSC1435)
                    135        606  Methanobacterium thermoautotrophicum (strain Delta H)
                    136        605  Rhizobium loti (Mesorhizobium loti)
                    137        602  Staphylococcus saprophyticus subsp. saprophyticus 
                    138        600  Yersinia pestis bv. Antiqua (strain Angola)
                    139        600  Salmonella paratyphi C (strain RKS4594)
                    140        599  Listeria monocytogenes
                    141        598  Photobacterium profundum (Photobacterium sp. (strain SS9))
                    142        590  Bacillus cereus (strain ATCC 10987)
                    143        590  Xanthomonas campestris pv. campestris
                    144        589  Listeria innocua
                    145        586  Pectobacterium carotovorum subsp. carotovorum (strain PC1)
                    146        585  Rickettsia prowazekii (strain Madrid E)
                    147        585  Helicobacter pylori (Campylobacter pylori)
                    148        581  Lactococcus lactis subsp. lactis (Streptococcus lactis)
                    149        579  Neisseria meningitidis serogroup B
                    150        576  Brucella suis
                    151        572  Brucella melitensis
                    152        572  Buchnera aphidicola subsp. Acyrthosiphon pisum 
                    153        572  Oryza sativa subsp. indica (Rice)
                    154        569  Bacillus thuringiensis subsp. konkukian
                    155        565  Helicobacter pylori J99 (Campylobacter pylori J99)
                    156        565  Pseudomonas syringae pv. syringae (strain B728a)
                    157        562  Bacillus licheniformis (strain DSM 13 / ATCC 14580)
                    158        562  Buchnera aphidicola subsp. Schizaphis graminum
                    159        561  Caulobacter crescentus (Caulobacter vibrioides)
                    160        560  Bacillus cereus (strain ZK / E33L)
                    161        558  Pseudomonas aeruginosa (strain UCBPP-PA14)
                    162        556  Neisseria meningitidis serogroup A
                    163        556  Clostridium acetobutylicum
                    164        556  Xanthomonas axonopodis pv. citri (Citrus canker)
                    165        554  Vibrio fischeri (strain ATCC 700601 / ES114)
                    166        552  Pseudomonas fluorescens (strain Pf0-1)
                    167        552  Caenorhabditis briggsae
                    168        551  Oceanobacillus iheyensis (strain DSM 14371 / JCM 11309 / KCTC 3954 / HTE831)
                    169        545  Pseudomonas fluorescens (strain Pf-5 / ATCC BAA-477)
                    170        543  Pseudomonas syringae pv. phaseolicola (strain 1448A / Race 6)
                    171        529  Listeria monocytogenes serotype 4b (strain F2365)
                    172        526  Sodalis glossinidius (strain morsitans)
                    173        526  Erwinia tasmaniensis (strain DSM 17950 / Et1/99)
                    174        525  Streptococcus pneumoniae
                    175        522  Xylella fastidiosa
                    176        521  Bordetella bronchiseptica (Alcaligenes bronchisepticus)
                    177        517  Thermotoga maritima
                    178        513  Chromobacterium violaceum
                    179        512  Xylella fastidiosa (strain Temecula1 / ATCC 700964)
                    180        511  Vibrio cholerae serotype O1 (strain ATCC 39541 / Ogawa 395 / O395)
                    181        508  Pseudomonas aeruginosa (strain PA7)
                    182        507  Bordetella parapertussis
                    183        507  Buchnera aphidicola subsp. Baizongia pistaciae (strain Bp)
                    184        507  Haemophilus ducreyi
                    185        507  Bordetella pertussis
                    186        507  Geobacillus kaustophilus
                    187        505  Staphylococcus aureus (strain Newman)
                    188        500  Pseudomonas entomophila (strain L48)
                    189        499  Deinococcus radiodurans
                    190        498  Brucella abortus
                    191        497  Rickettsia conorii (strain ATCC VR-613 / Malish 7)
                    192        496  Bacillus clausii (strain KSM-K16)
                    193        494  Haemophilus influenzae (strain 86-028NP)
                    194        493  Corynebacterium glutamicum (Brevibacterium flavum)
                    195        493  Streptomyces avermitilis
                    196        491  Bacillus amyloliquefaciens (strain FZB42)
                    197        491  Xanthomonas campestris pv. campestris (strain 8004)
                    198        490  Vibrio harveyi (strain ATCC BAA-1116 / BB120)
                    199        490  Clostridium perfringens
                    200        487  Burkholderia pseudomallei (Pseudomonas pseudomallei)
                    201        487  Shewanella sp. (strain MR-7)
                    202        484  Methanosarcina acetivorans (strain ATCC 35395 / DSM 2834 / JCM 12185 / C2A)
                    203        484  Mannheimia succiniciproducens (strain MBEL55E)
                    204        484  Pseudomonas aeruginosa (strain LESB58)
                    205        484  Staphylococcus aureus (strain Mu3 / ATCC 700698)
                    206        484  Shewanella sp. (strain MR-4)
                    207        483  Proteus mirabilis (strain HI4320)
                    208        483  Mycoplasma genitalium
                    209        476  Thermosynechococcus elongatus (strain BP-1)
                    210        475  Acinetobacter sp. (strain ADP1)
                    211        475  Synechococcus elongatus (strain PCC 7942) (Anacystis nidulans R2)
                    212        474  Burkholderia sp. (strain 383) (Burkholderia cepacia 
                    213        474  Pyrococcus horikoshii 
                    214        474  Enterococcus faecalis (Streptococcus faecalis)
                    215        473  Pseudomonas putida (strain F1 / ATCC 700007)
                    216        473  Brucella abortus (strain 2308)
                    217        468  Aspergillus oryzae (strain ATCC 42149 / RIB 40)
                    218        467  Rhodopseudomonas palustris
                    219        466  Xanthomonas campestris pv. vesicatoria (strain 85-10)
                    220        465  Pyrococcus abyssi (strain GE5 / Orsay)
                    221        465  Pseudomonas putida (strain GB-1)
                    222        464  Lactobacillus plantarum
                    223        464  Shewanella frigidimarina (strain NCIMB 400)
                    224        463  Shewanella sp. (strain ANA-3)
                    225        463  Methanosarcina mazei  
                    226        462  Anabaena variabilis (strain ATCC 29413 / PCC 7937)
                    227        462  Halobacterium salinarium (strain ATCC 700922 / JCM 11081 / NRC-1) 
                    228        461  Burkholderia mallei (Pseudomonas mallei)
                    229        461  Cupriavidus necator (strain ATCC 17699 / H16 / DSM 428 / Stanier 337) 
                    230        460  Aeromonas hydrophila subsp. hydrophila (strain ATCC 7966 / NCIB 9240)
                    231        459  Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
                    232        458  Cupriavidus pinatubonensis (strain JMP134 / LMG 1197) (Alcaligenes eutrophus) 
                    233        455  Staphylococcus aureus (strain JH1)
                    234        454  Xanthomonas oryzae pv. oryzae (strain MAFF 311018)
                    235        453  Rickettsia felis (strain ATCC VR-1525 / URRWXCal2) (Rickettsia azadi)
                    236        452  Ovis aries (Sheep)
                    237        452  Shewanella baltica (strain OS185)
                    238        452  Pseudomonas putida (strain W619)
                    239        450  Methylococcus capsulatus
                    240        450  Streptococcus mutans
                    241        449  Staphylococcus aureus (strain JH9)
                    242        449  Aeromonas salmonicida (strain A449)
                    243        448  Thermoanaerobacter tengcongensis
                    244        448  Rhodobacter sphaeroides (strain ATCC 17023 / 2.4.1 / NCIB 8253 / DSM 158)
                    245        447  Mycobacterium paratuberculosis
                    246        447  Vibrio fischeri (strain MJ11)
                    247        446  Pyrococcus furiosus (strain ATCC 43587 / DSM 3638 / JCM 8422 / Vc1)
                    248        444  Nicotiana tabacum (Common tobacco)
                    249        444  Hahella chejuensis (strain KCTC 2396)
                    250        444  Pseudomonas mendocina (strain ymp)
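
For further processing, the rows of the table above can be read into (rank, count, name) tuples. The following Python sketch assumes the row layout shown here (rank, entry count, then the species name); the regular expression and function name are illustrative only.

```python
import re

# One row: rank, entry count, then the species name, which may itself
# contain spaces and parentheses. Trailing whitespace is dropped.
ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(.+?)\s*$")


def parse_species_rows(lines):
    """Parse table rows, skipping headers, rulers and blank lines."""
    rows = []
    for line in lines:
        match = ROW.match(line)
        if match:
            rank, count, name = match.groups()
            rows.append((int(rank), int(count), name))
    return rows


sample = [
    "------  ---------  --------------------------------------------",
    "Number  Frequency  Species",
    "1      20244  Homo sapiens (Human)",
    "8       4430  Escherichia coli (strain K12)",
    "20       1787  Methanocaldococcus jannaschii  ",
]
rows = parse_species_rows(sample)
```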
                    
                    
                    
                    2.3  Taxonomic distribution of the sequences
                    
                    
                    
                    Kingdom        sequences (% of the database)
                    Archaea           18658 (  4%)
                    Bacteria         326570 ( 61%)
                    Eukaryota        170278 ( 32%)
                    Viruses           15967 (  3%)
                    
                    
                    Within Eukaryota:
                    
                    
                    
                    Category            sequences (% of Eukaryota) (% of the complete database)
                    Human                  20245 ( 12%)           (  4%)
                    Other Mammalia         45206 ( 27%)           (  9%)
                    Other Vertebrata       16720 ( 10%)           (  3%)
                    Viridiplantae          31498 ( 18%)           (  6%)
                    Fungi                  30164 ( 18%)           (  6%)
                    Insecta                 8273 (  5%)           (  2%)
                    Nematoda                4189 (  2%)           (  1%)
                    Other                  13983 (  8%)           (  3%)
                    
                    
                    
                    3.  SEQUENCE SIZE
                    
                    Distribution of the sequences by size (excluding fragments)
                    
                    From   To  Number             From   To   Number
                       1-  50    8655             1001-1100     3653
                      51- 100   40806             1101-1200     2548
                     101- 150   56967             1201-1300     1987
                     151- 200   56910             1301-1400     1840
                     201- 250   55844             1401-1500     1488
                     251- 300   49032             1501-1600      720
                     301- 350   49388             1601-1700      536
                     351- 400   42668             1701-1800      447
                     401- 450   35016             1801-1900      413
                     451- 500   28117             1901-2000      335
                     501- 550   19888             2001-2100      206
                     551- 600   14288             2101-2200      275
                     601- 650   12074             2201-2300      284
                     651- 700    8701             2301-2400      168
                     701- 750    7169             2401-2500      136
                     751- 800    5061             >2500         1058
                     801- 850    4453
                     851- 900    4955
                     901- 950    3801
                     951-1000    2686
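
The size classes in the table are 50 residues wide up to 1000, 100 residues wide from 1001 to 2500, and open-ended above 2500. A minimal Python sketch of that binning rule follows; the function name and the (low, high) return convention are assumptions made for illustration.

```python
def size_class(length):
    """Return the (low, high) size class for a sequence length.

    The open-ended ">2500" class is returned as (2501, None).
    """
    if length < 1:
        raise ValueError("sequence length must be positive")
    if length > 2500:
        return (2501, None)
    # Classes are 50 wide up to 1000 and 100 wide from 1001 to 2500.
    width = 50 if length <= 1000 else 100
    low = (length - 1) // width * width + 1
    return (low, low + width - 1)


for length in (2, 354, 1000, 1001, 35213):
    print(length, size_class(length))
```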
                    
                    
                    
                    
                    The average sequence length in UniProtKB/Swiss-Prot is 354 amino acids.
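
The average can be cross-checked against the totals quoted in the introduction. The sketch below assumes the average is taken over all entries and all amino acids in the release; the exact method is not stated here. Under that assumption the quotient is about 354.6, so the published value of 354 matches the integer part of the quotient rather than the rounded value.

```python
# Totals quoted in the introduction for release 2011_08.
total_amino_acids = 188463640
total_entries = 531473

mean_length = total_amino_acids / total_entries      # exact quotient
truncated_mean = total_amino_acids // total_entries  # integer part only

print(f"{mean_length:.1f}", truncated_mean)
```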
                    
                    The shortest sequence is   GWA_SEPOF (P83570):     2 amino acids.
                    The longest sequence is  TITIN_MOUSE (A2ASS6): 35213 amino acids.
                    
                    
                    4.  JOURNAL CITATIONS
                    
                    Note: the following citation statistics reflect the number of distinct
                    journal citations.
                    
                    Total number of journals cited in this release of UniProtKB/Swiss-Prot: 2154
                    
                    
                    4.1 Table of the frequency of journal citations
                    
                    Journals cited      1x: 703
                                        2x: 283
                                        3x: 146
                                        4x: 108
                                        5x:  92
                                        6x:  72
                                        7x:  34
                                        8x:  36
                                        9x:  33
                                       10x:  31
                                   11- 20x: 166
                                   21- 50x: 183
                                   51-100x:  98
                                     >100x: 169
                    
                    
                    4.2  List of the most cited journals in UniProtKB/Swiss-Prot
                    
                    Nb    Citations   Journal name
                    --    ---------   -------------------------------------------------------------
                    1        18922   Journal of Biological Chemistry
                    2         8774   Proceedings of the National Academy of Sciences of the U.S.A.
                    3         5228   Journal of Bacteriology
                    4         4712   Biochemical and Biophysical Research Communications
                    5         4544   Gene
                    6         4388   Nucleic Acids Research
                    7         4088   Biochemistry
                    8         4088   FEBS Letters
                    9         3914   The EMBO Journal
                    10         3583   Molecular and Cellular Biology
                    11         3419   Nature
                    12         3272   Journal of Molecular Biology
                    13         3153   European Journal of Biochemistry
                    14         3054   Biochimica et Biophysica Acta
                    15         2828   Cell
                    16         2494   Genomics
                    17         2272   Journal of Virology
                    18         2254   Biochemical Journal
                    19         2242   Science
                    20         1852   Molecular Microbiology
                    21         1673   Journal of Cell Biology
                    22         1555   Plant Physiology
                    23         1549   Plant Molecular Biology
                    24         1467   Genes and Development
                    25         1463   Virology
                    26         1379   The American Journal of Human Genetics
                    27         1376   Nature Genetics
                    28         1375   Human Molecular Genetics
                    29         1317   Molecular and General Genetics
                    30         1262   Oncogene
                    31         1247   Development
                    32         1186   Journal of Biochemistry
                    33         1164   Human Mutation
                    34         1122   Molecular Biology of the Cell
                    35         1089   The Plant Cell
                    36         1050   Journal of Immunology
                    37         1039   Genetics
                    38          978   Journal of General Virology
                    39          954   Structure
                    40          944   Molecular Cell
                    41          930   The Plant Journal
                    42          903   Infection and Immunity
                    43          860   Archives of Biochemistry and Biophysics
                    44          840   Blood
                    45          810   Journal of Cell Science
                    46          785   Microbiology
                    47          775   Yeast
                    48          766   Developmental Biology
                    49          710   Cancer Research
                    50          709   Current Biology
                    51          683   FEMS Microbiology Letters
                    52          610   Nature Structural Biology
                    53          608   Mechanisms of Development
                    54          606   Human Genetics
                    55          600   Acta Crystallographica, Section D
                    56          595   Protein Science
                    57          575   Applied and Environmental Microbiology
                    58          561   Journal of Neuroscience
                    59          543   Toxicon
                    60          532   Current Genetics
                    61          531   Neuron
                    62          528   Journal of Clinical Investigation
                    63          480   American Journal of Physiology
                    64          476   Mammalian Genome
                    65          468   The Journal of Experimental Medicine
                    66          452   Immunogenetics
                    67          452   Molecular Endocrinology
                    68          426   Molecular and Biochemical Parasitology
                    69          420   Journal of Neurochemistry
                    70          419   Proteins
                    71          418   The Journal of Clinical Endocrinology and Metabolism
                    72          411   Endocrinology
                    73          391   Bioscience, Biotechnology, and Biochemistry
                    74          384   Journal of Molecular Evolution
                    75          373   Journal of Medical Genetics
                    76          370   Plant and Cell Physiology
                    77          368   DNA and Cell Biology
                    78          363   Molecular Biology and Evolution
                    79          359   DNA Sequence
                    80          349   Nature Cell Biology
                    81          325   Experimental Cell Research
                    82          321   Tissue Antigens
                    83          318   Brain Research. Molecular Brain Research
                    84          317   Peptides
                    85          305   Comparative Biochemistry and Physiology
                    86          292   Biological Chemistry Hoppe-Seyler
                    87          292   Antimicrobial Agents and Chemotherapy
                    88          288   Journal of Investigative Dermatology
                    89          279   RNA
                    90          278   Cytogenetics and Cell Genetics
                    91          277   Developmental Cell
                    92          276   Molecular Pharmacology
                    93          267   Biology of Reproduction
                    94          263   Neurology
                    95          258   Virus Research
                    96          255   Genome Research
                    97          251   Journal of General Microbiology
                    98          250   Planta
                    99          249   Developmental Dynamics
                    100          241   Molecular Plant-Microbe Interactions
                    101          236   Nature Structural and Molecular Biology
                    102          218   Hoppe-Seyler's Zeitschrift fur Physiologische Chemie
                    103          214   Annals of Neurology
                    104          214   The FEBS Journal
                    105          214   Genes to Cells
                    106          213   Eukaryotic Cell
                    107          212   European Journal of Immunology
                    108          212   PLoS ONE
                    109          211   Biochimie
                    110          211   Immunity
                    111          209   DNA Research
                    112          204   The New England Journal of Medicine
                    113          204   European Journal of Human Genetics
                    114          202   EMBO Reports
                    115          194   Journal of Human Genetics
                    116          192   The FASEB Journal
                    117          185   Archives of Virology
                    118          179   Molecular and Cellular Endocrinology
                    119          179   Investigative Ophthalmology and Visual Science
                    120          173   Archives of Microbiology
                    121          169   Insect Biochemistry and Molecular Biology
                    122          167   Molecular Immunology
                    123          167   American Journal of Medical Genetics
                    124          166   Molecular Phylogenetics and Evolution
                    125          161   Glycobiology
                    126          159   BMC Genomics
                    127          159   DNA
                    128          159   American Journal of Medical Genetics. Part A
                    129          157   Clinical Genetics
                    130          156   Diabetes
                    131          155   Molecular Reproduction and Development
                    132          153   Journal of the American Chemical Society
                    133          153   Hemoglobin
                    134          152   Bioorganicheskaia Khimiia
                    135          151   Journal of Cellular Biochemistry
                    136          149   International Journal of Cancer
                    137          144   Nature Immunology
                    138          144   Molecular and Cellular Neuroscience
                    139          142   Molecular Genetics and Genomics
                    140          142   Molecular Genetics and Metabolism
                    141          138   General and Comparative Endocrinology
                    142          138   Animal Genetics
                    143          138   Biological Chemistry
                    144          137   British Journal of Haematology
                    145          131   Journal of Medicinal Chemistry
                    146          131   Journal of Lipid Research
                    147          131   Proteomics
                    148          128   Circulation Research
                    149          126   Thrombosis and Haemostasis
                    150          125   Protein Expression and Purification
                    
                    
                    5.  STATISTICS FOR SOME LINE TYPES
                    
                    The following table summarizes the total number of some UniProtKB/Swiss-Prot lines,
                    as well as the number of entries with at least one such line, and the
                    frequency of the lines.
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank
                    ------------------------------------  -------- ---------  ---------  ----
                    
                    References (RL)                       985729                 1.85        
                    Journal                            779699     403936      1.47       1
                    Submitted to EMBL/GenBank/DDBJ     197365     177674      0.37       2
                    Submitted to other databases         6567       6125      0.01       3
                    Book citation                         646        632     <0.01       4
                    Plant Gene Register                   569        557     <0.01       5
                    Thesis                                402        399     <0.01       6
                    Unpublished observations              294        290     <0.01       7
                    Patent                                181        178     <0.01       8
                    Worm Breeder's Gazette                  6          6     <0.01       9
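
The "Average per entry" figures in the table above appear to be each line total divided by the 531473 entries in the release (the count given in the introduction), rounded to two decimals, with values that round to zero shown as "<0.01". This is inferred from the published numbers, not stated in the release notes. A minimal Python sketch under that assumption (the function name and the 0.005 threshold are illustrative):

```python
# Number of entries in release 2011_08, as stated in the introduction.
TOTAL_ENTRIES = 531473

# Line totals copied from the reference (RL) table above.
reference_line_totals = {
    "References (RL)": 985729,
    "Journal": 779699,
    "Submitted to EMBL/GenBank/DDBJ": 197365,
    "Submitted to other databases": 6567,
    "Book citation": 646,
}


def average_per_entry(total_lines, total_entries=TOTAL_ENTRIES):
    """Return lines per entry as a two-decimal string.

    Assumption: values that would round to 0.00 are printed as '<0.01',
    which matches the rows of the published table.
    """
    avg = total_lines / total_entries
    return "<0.01" if avg < 0.005 else f"{avg:.2f}"


for name, total in reference_line_totals.items():
    print(f"{name:<32} {total:>8} {average_per_entry(total):>6}")
```

The same calculation reproduces the averages in the comment, feature and cross-reference tables, for example 2323810 comment lines giving 4.37 per entry.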
                    
                    Total number of distinct authors cited in UniProtKB/Swiss-Prot: 306144
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank
                    ------------------------------------  -------- ---------  ---------  ----
                    Comments (CC)                        2323810                 4.37        
                    ALLERGEN                              506        506     <0.01      26
                    ALTERNATIVE PRODUCTS                20082      20082      0.04      13
                    BIOPHYSICOCHEMICAL PROPERTIES        3729       3729      0.01      23
                    BIOTECHNOLOGY                         317        315     <0.01      28
                    CATALYTIC ACTIVITY                 233078     212374      0.44       4
                    CAUTION                              7688       7534      0.01      19
                    COFACTOR                           102707      94384      0.19       7
                    DEVELOPMENTAL STAGE                  9329       9329      0.02      17
                    DISEASE                              4605       3110      0.01      21
                    DISRUPTION PHENOTYPE                 3857       3857      0.01      22
                    DOMAIN                              35002      30973      0.07      11
                    ENZYME REGULATION                    9663       9663      0.02      16
                    FUNCTION                           403848     387106      0.76       2
                    INDUCTION                           13278      13278      0.02      14
                    INTERACTION                         13162      13162      0.02      15
                    MASS SPECTROMETRY                    4870       3698      0.01      20
                    MISCELLANEOUS                       31312      28900      0.06      12
                    PATHWAY                            129762     118390      0.24       6
                    PHARMACEUTICAL                         85         85     <0.01      29
                    POLYMORPHISM                          831        790     <0.01      24
                    PTM                                 38816      31080      0.07       9
                    RNA EDITING                           621        621     <0.01      25
                    SEQUENCE CAUTION                    39644      39644      0.07       8
                    SIMILARITY                         626108     506653      1.18       1
                    SUBCELLULAR LOCATION               314560     309107      0.59       3
                    SUBUNIT                            231366     231366      0.44       5
                    TISSUE SPECIFICITY                  35804      35804      0.07      10
                    TOXIC DOSE                            471        457     <0.01      27
                    WEB RESOURCE                         8709       6989      0.02      18
                    
                    Total number of comment topics: 29
                    
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank
                    ------------------------------------  -------- ---------  ---------  ----
                    Features (FT)                        3439528                 6.47        
                    ACT_SITE                           134120      80735      0.25       9
                    BINDING                            231290      64967      0.44       4
                    CA_BIND                              3784       1564      0.01      35
                    CARBOHYD                           105281      26710      0.20      13
                    CHAIN                              538093     525848      1.01       1
                    COILED                              19443      13309      0.04      26
                    COMPBIAS                            52409      27625      0.10      18
                    CONFLICT                           122752      43069      0.23      11
                    CROSSLNK                             6094       3629      0.01      34
                    DISULFID                           101968      27480      0.19      15
                    DNA_BIND                            11227      10350      0.02      30
                    DOMAIN                             155013      92602      0.29       6
                    HELIX                              145700      15190      0.27       7
                    INIT_MET                            15093      15093      0.03      27
                    INTRAMEM                             1890        824     <0.01      38
                    LIPID                               11053       7030      0.02      31
                    METAL                              294247      72059      0.55       3
                    MOD_RES                            185631      61164      0.35       5
                    MOTIF                               34249      22055      0.06      23
                    MUTAGEN                             35987       8442      0.07      22
                    NON_CONS                             1981        733     <0.01      37
                    NON_STD                               353        278     <0.01      39
                    NON_TER                             12047       9168      0.02      29
                    NP_BIND                            112614      70703      0.21      12
                    PEPTIDE                              9567       6400      0.02      32
                    PROPEP                              12178      10445      0.02      28
                    REGION                             104916      56299      0.20      14
                    REPEAT                              92113      13626      0.17      16
                    SIGNAL                              36920      36910      0.07      21
                    SITE                                39850      23557      0.07      20
                    STRAND                             143281      14132      0.27       8
                    TOPO_DOM                           124935      25640      0.24      10
                    TRANSIT                              7637       7550      0.01      33
                    TRANSMEM                           348086      71583      0.65       2
                    TURN                                33831      11864      0.06      24
                    UNSURE                               2515        498     <0.01      36
                    VAR_SEQ                             40680      17542      0.08      19
                    VARIANT                             81657      16554      0.15      17
                    ZN_FING                             29043      12624      0.05      25
                    
                    Total number of feature keys: 39
                    
                    
                    
                                                       Total    Number of  Average
                    Line type / subtype                number   entries    per entry  Rank      Category
                    ------------------------------------  -------- ---------  ---------  ----      -------------------------------------------
                    Cross-references (DR)               14455270                27.20                          
                    2DBase-Ecoli                           85         85     <0.01     122      2D gel databases                             
                    Aarhus/Ghent-2DPAGE                   126         96     <0.01     119      2D gel databases                             
                    AGD                                   926        920     <0.01      97      Organism-specific databases
                    Allergome                            1350        817     <0.01      93      Protein family/group databases               
                    ANU-2DPAGE                             23         23     <0.01     128      2D gel databases                             
                    ArachnoServer                         759        755     <0.01     101      Organism-specific databases                  
                    ArrayExpress                        58710      58710      0.11      42      Gene expression databases                    
                    Bgee                                40093      40085      0.08      46      Gene expression databases
                    BindingDB                             296        296     <0.01     115      Other       
                    BioCyc                             251762     243552      0.47      20      Enzyme and pathway databases                 
                    BRENDA                               4200       4194      0.01      84      Enzyme and pathway databases                 
                    CAZy                                 7395       6648      0.01      70      Protein family/group databases
                    CGD                                   612        602     <0.01     105      Organism-specific databases
                    CleanEx                             30118      29475      0.06      48      Gene expression databases                    
                    COMPLUYEAST-2DPAGE                    101        100     <0.01     121      2D gel databases                             
                    ConoServer                            762        736     <0.01      99      Organism-specific databases                  
                    Cornea-2DPAGE                          67         67     <0.01     123      2D gel databases                             
                    CTD                                 66443      65826      0.13      40      Organism-specific databases
                    CYGD                                 5594       5591      0.01      74      Organism-specific databases
                    dictyBase                            3981       3981      0.01      86      Organism-specific databases                  
                    DIP                                 12540      12418      0.02      64      Protein-protein interaction databases
                    DisProt                               397        394     <0.01     111      3D structure databases                       
                    DOSAC-COBS-2DPAGE                     149        147     <0.01     118      2D gel databases                             
                    DrugBank                             5318       1627      0.01      75      Other       
                    EchoBASE                             4167       4163      0.01      85      Organism-specific databases                  
                    ECO2DBASE                             352        300     <0.01     113      2D gel databases                             
                    EcoGene                              4291       4289      0.01      83      Organism-specific databases                  
                    eggNOG                             219237     219237      0.41      21      Phylogenomic databases                       
                    EMBL                               906448     521173      1.71       3      Sequence databases                           
                    Ensembl                             70387      51633      0.13      37      Genome annotation databases                  
                    EnsemblBacteria                     97409      84537      0.18      29      Genome annotation databases                  
                    EnsemblFungi                        15123      15033      0.03      61      Genome annotation databases                  
                    EnsemblMetazoa                      11353       8725      0.02      65      Genome annotation databases                  
                    EnsemblPlants                       16253      14114      0.03      59      Genome annotation databases                  
                    EnsemblProtists                      4402       4285      0.01      82      Genome annotation databases                  
                    euHCVdb                                55         44     <0.01     124      Organism-specific databases
                    EuPathDB                              742        742     <0.01     102      Organism-specific databases                  
                    FlyBase                              5798       5424      0.01      73      Organism-specific databases                  
                    Gene3D                             324463     251486      0.61      16      Family and domain databases                  
                    GeneCards                           20229      19669      0.04      53      Organism-specific databases                  
                    GeneDB_Spombe                        4982       4940      0.01      77      Organism-specific databases                  
                    GeneFarm                             2997       2983      0.01      89      Organism-specific databases                  
                    GeneID                             480432     461330      0.90       6      Genome annotation databases                  
                    GeneTree                           168534     168491      0.32      23      Phylogenomic databases                       
                    Genevestigator                      66026      66026      0.12      41      Gene expression databases                    
                    GenoList                             7056       7044      0.01      71      Organism-specific databases                  
                    GenomeReviews                      374532     355047      0.70      10      Genome annotation databases                  
                    GermOnline                          41912      41338      0.08      45      Gene expression databases                    
                    GlycoSuiteDB                          272        272     <0.01     116      PTM databases
                    GO                                2121254     497872      3.99       1      Ontologies
                    Gramene                              4662       4662      0.01      78      Organism-specific databases                  
                    H-InvDB                             13206      12309      0.02      63      Organism-specific databases                  
                    HAMAP                              310179     310023      0.58      17      Family and domain databases                  
                    HGNC                                19732      19572      0.04      55      Organism-specific databases
                    HOGENOM                            363532     363532      0.68      12      Phylogenomic databases                       
                    HOVERGEN                            74919      74919      0.14      36      Phylogenomic databases                       
                    HPA                                 14687      11144      0.03      62      Organism-specific databases
                    HSSP                                29870      29870      0.06      49      3D structure databases
                    InParanoid                          67855      67855      0.13      39      Phylogenomic databases                       
                    IntAct                              25487      25487      0.05      51      Protein-protein interaction databases        
                    InterPro                          1759188     505021      3.31       2      Family and domain databases                  
                    IPI                                 91953      65629      0.17      32      Sequence databases
                    KEGG                               452497     431487      0.85       8      Genome annotation databases                  
                    LegioList                             761        759     <0.01     100      Organism-specific databases                  
                    Leproma                               671        668     <0.01     104      Organism-specific databases                  
                    MaizeGDB                              479        474     <0.01     109      Organism-specific databases                  
                    MEROPS                              10702      10370      0.02      66      Protein family/group databases               
                    MGI                                 16285      16240      0.03      58      Organism-specific databases
                    MIM                                 16826      13049      0.03      57      Organism-specific databases
                    MINT                                17587      17587      0.03      56      Protein-protein interaction databases
                    NextBio                             49168      49166      0.09      43      Other       
                    neXtProt                            20042      20040      0.04      54      Organism-specific databases                  
                    NMPDR                              132657     132646      0.25      26      Genome annotation databases                  
                    OGP                                   377        377     <0.01     112      2D gel databases
                    OMA                                371764     371764      0.70      11      Phylogenomic databases
                    Orphanet                             3938       2358      0.01      87      Organism-specific databases                  
                    OrthoDB                             77300      77244      0.15      35      Phylogenomic databases                       
                    PANTHER                            169384     160995      0.32      22      Family and domain databases                  
                    Pathway_Interaction_DB               4567       1665      0.01      81      Enzyme and pathway databases                 
                    PDB                                 77379      17071      0.15      34      3D structure databases
                    PDBsum                              77379      17071      0.15      33      3D structure databases                       
                    PeptideAtlas                         5164       5164      0.01      76      Proteomic databases                          
                    PeroxiBase                            738        727     <0.01     103      Protein family/group databases               
                    Pfam                               700476     490170      1.32       4      Family and domain databases                  
                    PharmGKB                            15409      15103      0.03      60      Organism-specific databases                  
                    PHCI-2DPAGE                           247        247     <0.01     117      2D gel databases                             
                    PhosphoSite                         23771      23771      0.04      52      PTM databases
                    PhosSite                              351        351     <0.01     114      PTM databases
                    PhylomeDB                          123131     123131      0.23      27      Phylogenomic databases                       
                    PIR                                117059     107034      0.22      28      Sequence databases
                    PIRSF                               93263      93260      0.18      31      Family and domain databases                  
                    PMAP-CutDB                           1399       1399     <0.01      92      Other       
                    PMMA-2DPAGE                            52         52     <0.01     125      2D gel databases                             
                    PptaseDB                               34         34     <0.01     126      Protein family/group databases               
                    PRIDE                               69531      69531      0.13      38      Proteomic databases                          
                    PRINTS                             138523     120110      0.26      25      Family and domain databases                  
                    ProDom                              27869      27690      0.05      50      Family and domain databases                  
                    ProMEX                                481        481     <0.01     108      Proteomic databases
                    PROSITE                            471868     299364      0.89       7      Family and domain databases                  
                    ProtClustDB                        340477     340477      0.64      14      Phylogenomic databases                       
                    ProteinModelPortal                 425501     425501      0.80       9      3D structure databases                       
                    PseudoCAP                            1224       1215     <0.01      95      Organism-specific databases                  
                    Rat-heart-2DPAGE                       28         28     <0.01     127      2D gel databases                             
                    Reactome                             9367       5611      0.02      68      Enzyme and pathway databases                 
                    REBASE                                443        399     <0.01     110      Protein family/group databases
                    RefSeq                             504753     461605      0.95       5      Sequence databases                           
                    REPRODUCTION-2DPAGE                  1256       1035     <0.01      94      2D gel databases                             
                    RGD                                  7535       7531      0.01      69      Organism-specific databases
                    SGD                                  6638       6633      0.01      72      Organism-specific databases
                    Siena-2DPAGE                          102        102     <0.01     120      2D gel databases                             
                    SMART                              156741     119308      0.29      24      Family and domain databases                  
                    SMR                                349505     349505      0.66      13      3D structure databases
                    STRING                             307612     307612      0.58      18      Protein-protein interaction databases        
                    SUPFAM                             325590     258893      0.61      15      Family and domain databases                  
                    SWISS-2DPAGE                         1184       1183     <0.01      96      2D gel databases                             
                    TAIR                                10669      10583      0.02      67      Organism-specific databases
                    TCDB                                 3579       3565      0.01      88      Protein family/group databases
                    TIGR                                34364      33591      0.06      47      Genome annotation databases
                    TIGRFAMs                           285754     265667      0.54      19      Family and domain databases                  
                    TubercuList                          1892       1856     <0.01      91      Organism-specific databases                  
                    UCD-2DPAGE                            510        501     <0.01     107      2D gel databases                             
                    UCSC                                47642      37089      0.09      44      Genome annotation databases
                    UniGene                             94563      86843      0.18      30      Sequence databases                           
                    VectorBase                            537        524     <0.01     106      Genome annotation databases                  
                    World-2DPAGE                          917        906     <0.01      98      2D gel databases                             
                    WormBase                             4654       3825      0.01      79      Organism-specific databases                  
                    Xenbase                              4651       4628      0.01      80      Organism-specific databases                  
                    ZFIN                                 2670       2658      0.01      90      Organism-specific databases
                    
                    Total number of cross-referenced databases: 128
                    
                    6.  AMINO ACID COMPOSITION
                    
                    6.1  Composition in percent for the complete database
                    
                    Ala (A) 8.26   Gln (Q) 3.93   Leu (L) 9.66   Ser (S) 6.55
                    Arg (R) 5.53   Glu (E) 6.75   Lys (K) 5.85   Thr (T) 5.34
                    Asn (N) 4.06   Gly (G) 7.08   Met (M) 2.42   Trp (W) 1.08
                    Asp (D) 5.46   His (H) 2.27   Phe (F) 3.86   Tyr (Y) 2.92
                    Cys (C) 1.36   Ile (I) 5.97   Pro (P) 4.70   Val (V) 6.87
                    
                    Asx (B) 0.000  Glx (Z) 0.000  Xaa (X) 0.00
                    
                    
                    
                    
                    
                    6.2  Classification of the amino acids by their frequency
                    
                    Leu, Ala, Gly, Val, Glu, Ser, Ile, Lys, Arg, Asp, Thr, Pro, Asn, Gln,
                    Phe, Tyr, Met, His, Cys, Trp
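
The ordering above is what results from sorting the percentages in section 6.1 from highest to lowest. A short Python sketch that reproduces it from those values (the variable names are illustrative):

```python
# Composition in percent for the complete database, copied from section 6.1.
composition = {
    "Ala": 8.26, "Gln": 3.93, "Leu": 9.66, "Ser": 6.55,
    "Arg": 5.53, "Glu": 6.75, "Lys": 5.85, "Thr": 5.34,
    "Asn": 4.06, "Gly": 7.08, "Met": 2.42, "Trp": 1.08,
    "Asp": 5.46, "His": 2.27, "Phe": 3.86, "Tyr": 2.92,
    "Cys": 1.36, "Ile": 5.97, "Pro": 4.70, "Val": 6.87,
}

# Sort residue names by descending percentage; no two values tie here.
ranking = sorted(composition, key=composition.get, reverse=True)

print(", ".join(ranking))
```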
                    
                    
                    7.  MISCELLANEOUS STATISTICS
                    
                    4447 entries are encoded on a mitochondrion, and 3629 are encoded on a plasmid.
                    
                    12182 entries are encoded on a plastid, of which 21 are encoded on apicoplasts,
                    11618 on chloroplasts, 50 on organellar chromatophores, 145 on cyanelles,
                    149 on non-photosynthetic plastids and 199 on unspecified types of plastid.
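
As a consistency check, the six plastid subtype counts listed above should add up to the stated plastid total of 12182. A small Python sketch of that check (the dictionary keys are shortened labels chosen for illustration):

```python
# Plastid subtype counts copied from the sentence above.
plastid_breakdown = {
    "apicoplast": 21,
    "chloroplast": 11618,
    "organellar chromatophore": 50,
    "cyanelle": 145,
    "non-photosynthetic plastid": 149,
    "unspecified plastid": 199,
}

STATED_PLASTID_TOTAL = 12182

subtype_sum = sum(plastid_breakdown.values())
print(subtype_sum, subtype_sum == STATED_PLASTID_TOTAL)
```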
                    
                    Number of entries with at least one sequence correction: 72207