Research

Assessing quality and completeness of human transcriptional regulatory pathways on a genome-wide scale

Evgeny Shmelkov (1,2), Zuojian Tang (2), Iannis Aifantis (3) and Alexander Statnikov (2,4)*

Author Affiliations

1 Department of Pharmacology, New York University School of Medicine, New York, NY, USA

2 Center for Health Informatics and Bioinformatics, New York University School of Medicine, New York, NY, USA

3 Howard Hughes Medical Institute and Department of Pathology, New York University School of Medicine, New York, NY, USA

4 Department of Medicine, New York University School of Medicine, New York, NY, USA

Biology Direct 2011, 6:15  doi:10.1186/1745-6150-6-15

Published: 28 February 2011



Pathway databases are becoming increasingly important and are now used in most types of biological and translational research. However, little is known about the quality and completeness of the pathways stored in these databases. The present study conducts a comprehensive assessment of human transcriptional regulatory pathways for seven well-studied transcription factors: MYC, NOTCH1, BCL6, TP53, AR, STAT1, and RELA. The benchmarking methodology first integrates genome-wide binding data with functional gene expression data to derive the direct targets of each transcription factor. The lists of experimentally obtained direct targets are then compared with the corresponding lists of transcriptional targets from 10 commonly used pathway databases.


The results of this study show that for the majority of pathway databases, the overlap between experimentally obtained target genes and targets reported in transcriptional regulatory pathway databases is surprisingly small and often not statistically significant. The only exception is the MetaCore pathway database, which yields a statistically significant intersection with experimental results in 84% of cases. Additionally, we suggest that the lists of experimentally derived direct targets obtained in this study can be used to reveal new biological insights into transcriptional regulation and to suggest novel putative therapeutic targets in cancer.
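The two benchmarking steps described above can be illustrated with a minimal sketch. It assumes that direct targets are taken as the genes that are both bound by the transcription factor and responsive in expression, and that overlap significance is judged with a one-sided hypergeometric test; the abstract does not name the test used, so this choice is an assumption. The function names, gene labels, and universe size are hypothetical and do not come from the study.

```python
from math import comb


def direct_targets(bound_genes, responsive_genes):
    """Genes that are both bound by the factor and change expression (assumed definition)."""
    return set(bound_genes) & set(responsive_genes)


def overlap_p_value(experimental, database, universe_size):
    """Upper-tail hypergeometric probability of an overlap at least as large as observed.

    universe_size is the number of genes that could have appeared in either list.
    """
    experimental, database = set(experimental), set(database)
    n_exp, n_db = len(experimental), len(database)
    if universe_size < max(n_exp, n_db):
        raise ValueError("universe_size must be at least as large as each gene list")
    observed = len(experimental & database)
    # Count the ways to draw n_db genes from the universe with i experimental hits,
    # summed over every i from the observed overlap up to the largest possible one.
    tail = sum(
        comb(n_exp, i) * comb(universe_size - n_exp, n_db - i)
        for i in range(observed, min(n_exp, n_db) + 1)
    )
    return tail / comb(universe_size, n_db)


# Toy data with placeholder gene labels (not real results).
bound = {"G1", "G2", "G3", "G4"}
responsive = {"G2", "G3", "G4", "G5"}
experimental = direct_targets(bound, responsive)
database_targets = {"G3", "G4", "G9"}
p = overlap_p_value(experimental, database_targets, universe_size=10)
print(sorted(experimental), p)
```

A small p-value would indicate that a database's target list overlaps the experimental list more than expected by chance. In practice the gene universe would be the set of genes measurable by both assays, and p-values across factors and databases would need multiple-testing correction.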


Our study opens a debate on the validity of using many popular pathway databases to obtain transcriptional regulatory targets. We conclude that the choice of pathway databases should be informed by solid scientific evidence and rigorous empirical evaluation.


This article was reviewed by Prof. Wing Hung Wong, Dr. Thiago Motta Venancio (nominated by Dr. L Aravind), and Prof. Geoff J McLachlan.