Alex Bateman

Senior Team Leader, Protein Sequence Resources, and Group Leader, Bateman Research Group.


EMBL Outstation European Bioinformatics Institute (EMBL/EBI)
Wellcome Trust Genome Campus
Hinxton, Cambridge
CB10 1SD
United Kingdom
Phone: +44 (0)1223 379 494100
Fax: +44 (0)1223 494469

Short biography

Dr. Alex Bateman has a PhD from the University of Cambridge. During 15 years at the Sanger Institute, he led the development of the widely used Pfam protein family database. In 2003 he started the Rfam database of RNA families, which for the first time allowed large-scale annotation of the non-coding RNA genome. In 2010 he received the Benjamin Franklin Award for Open Data in the Life Sciences. In 2012 Dr. Bateman joined the European Bioinformatics Institute (EMBL-EBI) as Head of Protein Sequence Resources. He was Executive Editor of the journal Bioinformatics from 2004 to 2012 and spent four years as Editor of the Nucleic Acids Research Database Issue. He recently edited a book on protein families with Christine Orengo. He has over 140 publications, which have received over 50,000 citations.


The Protein Sequence Resources cluster at EMBL-EBI comprises world-leading databases such as UniProt, InterPro, Pfam and Rfam. In addition, the Bateman Research Group analyses protein and RNA sequences to understand how they evolve, function and interact.

Selected publications

  1. Das D, Murzin AG, Rawlings ND, Finn RD, Coggill P, Bateman A, Godzik A, Aravind L: Structure and computational analysis of a novel protein with metallopeptidase-like and circularly permuted winged-helix-turn-helix domains reveals a possible role in modified polysaccharide biosynthesis. BMC Bioinformatics, 2014, 15: 75. doi: 10.1186/1471-2105-15-75.
  2. Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, Heger A, Hetherington K, Holm L, Mistry J, Sonnhammer EL, Tate J, Punta M: Pfam: the protein families database. Nucleic Acids Res. 2014, 42: D222-230. doi: 10.1093/nar/gkt1223.
  3. Finn RD, Miller BL, Clements J, Bateman A: iPfam: a database of protein family and domain interactions found in the Protein Data Bank. Nucleic Acids Res. 2014, 42: D364-373. doi: 10.1093/nar/gkt1210.
  4. Mitchell A, Chang HY, Daugherty L, Fraser M, Hunter S, Lopez R, McAnulla C, McMenamin C, Nuka G, Pesseat S, Sangrador-Vegas A, Scheremetjew M, Rato C, Yong SY, Bateman A, Punta M, Attwood TK, Sigrist CJ, Redaschi N, Rivoire C, Xenarios I, Kahn D, Guyot D, Bork P, Letunic I, Gough J, Oates M, Haft D, Huang H, Natale DA, Wu CH, Orengo C, Sillitoe I, Mi H, Thomas PD, Finn RD: The InterPro protein families database: the classification resource after 15 years. Nucleic Acids Res., 2014 Nov 26. pii: gku1243.
  5. Nawrocki EP, Burge SW, Bateman A, Daub J, Eberhardt RY, Eddy SR, Floden EW, Gardner PP, Jones TA, Tate J, Finn RD: Rfam 12.0: updates to the RNA families database. Nucleic Acids Res., 2014 Nov 11. pii: gku1063.
  6. Poux S, Magrane M, Arighi CN, Bridge A, O’Donovan C, Laiho K; UniProt Consortium: Expert curation in UniProtKB: a case study on dealing with conflicting and erroneous data. Database (Oxford), 2014, 2014: bau016. doi: 10.1093/database/bau016.
  7. Rawlings ND, Waller M, Barrett AJ, Bateman A: MEROPS: the database of proteolytic enzymes, their substrates and inhibitors. Nucleic Acids Res. 2014, 42: D503-509. doi: 10.1093/nar/gkt953.
  8. Schreiber F, Patricio M, Muffato M, Pignatelli M, Bateman A: TreeFam v9: a new website, more species and orthology-on-the-fly. Nucleic Acids Res. 2014, 42: D922-925. doi: 10.1093/nar/gkt1055.
  9. UniProt Consortium: Activities at the Universal Protein Resource (UniProt). Nucleic Acids Res. 2014, 42: D191-198. doi: 10.1093/nar/gkt1140.
  10. Burge SW, Daub J, Eberhardt R, Tate J, Barquist L, Nawrocki EP, Eddy SR, Gardner PP, Bateman A: Rfam 11.0: 10 years of RNA families. Nucleic Acids Res. 2013, 41: D226-232. doi: 10.1093/nar/gks1005.
  11. Eberhardt RY, Chang Y, Bateman A, Murzin AG, Axelrod HL, Hwang WC, Aravind L: Filling out the structural map of the NTF2-like superfamily. BMC Bioinformatics 2013, 14: 327. doi: 10.1186/1471-2105-14-327.
  12. Hwang WC, Bakolitsa C, Punta M, Coggill PC, Bateman A, Axelrod HL, Rawlings ND, Sedova M, Peterson S, Eberhardt RY, Aravind L, Pascual J, Godzik A: LUD, a new protein domain associated with lactate utilization. BMC Bioinformatics 2013, 14: 341. doi: 10.1186/1471-2105-14-341.
  13. Mistry J, Coggill P, Eberhardt RY, Deiana A, Giansanti A, Finn RD, Bateman A, Punta M: The challenge of increasing Pfam coverage of the human proteome. Database (Oxford) 2013, 2013:bat023. doi: 10.1093/database/bat023.
  14. Mistry J, Finn RD, Eddy SR, Bateman A, Punta M: Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions. Nucleic Acids Res. 2013, 41:e121. doi: 10.1093/nar/gkt263.
  15. Buljan M, Chalancon B, Eustermann S, Wagner GP, Fuxreiter M, Bateman A, Babu MM: Tissue-specific splicing of disordered segments that embed binding motifs rewires protein interaction networks. Mol. Cell, 2012, 46: 871-883.
  16. Buljan M, Frankish A, Bateman A: Quantifying the mechanisms of domain gain in animal proteins. Genome Biol., 2010, 11: R74.
  17. Bateman A, Finn RD, Sims PJ, Wiedmer T, Biegert A, Soeding J: Phospholipid scramblases and Tubby-like proteins belong to a new superfamily of membrane tethered transcription factors. Bioinformatics, 2009, 25: 159-162.
  18. Schuster-Boeckler B, Bateman A: Protein interactions in human genetic diseases. Genome Biol., 2008, 9: R9.
UniProt is an ELIXIR core data resource
Main funding by: National Institutes of Health
