Purine biosynthesis in archaea: variations on a theme
1 Department of Chemistry, Roanoke College, Salem, VA 24153, USA
2 Department of Biochemistry, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061-0308, USA
3 Department of Cell and Molecular Physiology, University of North Carolina at Chapel Hill, School of Medicine, Chapel Hill, NC 27599, USA
Biology Direct 2011, 6:63 doi:10.1186/1745-6150-6-63
Published: 14 December 2011
The ability to perform de novo biosynthesis of purines is present in organisms in all three domains of life, reflecting the essentiality of these molecules to life. Although the pathway is quite similar in eukaryotes and bacteria, the archaeal pathway is more variable. Careful manual inspection of the genes in this pathway demonstrates the value of manual curation in archaea, even in pathways that have been well studied in other domains.
We searched the Integrated Microbial Genome system (IMG) for the 17 distinct genes involved in the 11 steps of de novo purine biosynthesis in 65 sequenced archaea, finding 738 predicted proteins with sequence similarity to known purine biosynthesis enzymes. Each sequence was manually inspected for the presence of active site residues and other residues known or suspected to be required for function.
Many apparently purine-biosynthesizing archaea lack evidence for a single enzyme, either glycinamide ribonucleotide formyltransferase or inosine monophosphate cyclohydrolase, suggesting that there are at least two more gene variants in the purine biosynthetic pathway to discover. Variations in domain arrangement of formylglycinamidine ribonucleotide synthetase and substantial problems in aminoimidazole carboxamide ribonucleotide formyltransferase and inosine monophosphate cyclohydrolase assignments were also identified.
Manual curation revealed some overly specific annotations in IMG gene product names: predicted proteins lacking essential active site residues had been assigned product names implying enzymatic activity (21 proteins, 2.8% of proteins inspected) or Enzyme Commission (E.C.) numbers (57 proteins, 7.7%). There were also 57 proteins (7.7%) assigned overly generic names and 78 proteins (10.6%) without E.C. numbers as part of the assigned name when a specific enzyme name and E.C. number were well justified.
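The percentages reported above can be checked with a short calculation. The sketch below assumes that the denominator is the 738 predicted proteins identified in the IMG search; the category labels and the function name are illustrative and are not taken from the study.

```python
# Assumption: every percentage is taken over the 738 predicted proteins inspected.
TOTAL_INSPECTED = 738

# Counts as reported in the text; the labels are illustrative shorthand.
counts = {
    "name implies activity, essential residues missing": 21,
    "E.C. number assigned, essential residues missing": 57,
    "overly generic name": 57,
    "E.C. number missing from assigned name": 78,
}


def percent_of_inspected(count, total=TOTAL_INSPECTED):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


percentages = {label: percent_of_inspected(n) for label, n in counts.items()}

for label, pct in percentages.items():
    print(f"{label}: {counts[label]} proteins ({pct}%)")
```

Under that assumption the computed values agree with the reported 2.8%, 7.7%, and 10.6%.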
The patchy distribution of purine biosynthetic genes in archaea is consistent with a pathway that has been shaped by horizontal gene transfer, duplication, and gene loss. Our results indicate that manual curation can improve upon automated annotation for a small number of automatically annotated proteins and can reveal a need to identify further pathway components even in well-studied pathways.
This article was reviewed by Dr. Céline Brochier-Armanet, Dr. Kira S Makarova (nominated by Dr. Eugene Koonin), and Dr. Michael Galperin.