Open Access Research

RAId_DbS: Peptide Identification using Database Searches with Realistic Statistics

Gelio Alves, Aleksey Y Ogurtsov and Yi-Kuo Yu*

Author Affiliations

National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20894

Biology Direct 2007, 2:25  doi:10.1186/1745-6150-2-25

Published: 25 October 2007

Abstract

Background

The key to mass-spectrometry-based proteomics is peptide identification. A major challenge in peptide identification is to obtain realistic E-values when assigning statistical significance to candidate peptides.

Results

Using a simple scoring scheme, we propose a database search method with theoretically characterized statistics. Taking into account possible skewness in the random variable distribution and the effect of finite sampling, we provide a theoretical derivation for the tail of the score distribution. For every experimental spectrum examined, we collect the scores of peptides in the database and find good agreement between the collected score statistics and our theoretical distribution. Using Student's t-tests, we quantify the degree of agreement between the theoretical distribution and the collected score statistics. The t-tests may be used to measure the reliability of reported statistics. When combined with the reported P-value for a peptide hit, obtained from a score distribution model, this new measure prevents exaggerated statistics. Another feature of RAId_DbS is its capability of detecting multiple co-eluted peptides. The peptide identification performance and statistical accuracy of RAId_DbS are assessed and compared with those of several other search tools. The executables and data related to RAId_DbS are freely available upon request.
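
To make the relation between a P-value and an E-value concrete, the following minimal Python sketch may help. It is not the RAId_DbS method: it replaces the theoretical score distribution described above with a plain empirical tail fraction, and it assumes the common convention that the E-value is the P-value multiplied by the number of candidate peptides scored. The function names and the toy scores are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def empirical_p_value(scores: Sequence[float], threshold: float) -> float:
    """Fraction of candidate scores at or above `threshold`.

    Assumption: an empirical stand-in for the tail of a fitted
    score distribution; the paper derives this tail theoretically.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    hits = sum(1 for s in scores if s >= threshold)
    return hits / len(scores)


def e_value(p_value: float, n_candidates: int) -> float:
    """Expected number of random candidates scoring at least as well.

    Assumption: the usual convention E = P * N, with N the number of
    candidate peptides scored against the spectrum.
    """
    return p_value * n_candidates


# Toy example with made-up scores for one spectrum (hypothetical data).
toy_scores = [1.0, 2.0, 3.0, 4.0]
p = empirical_p_value(toy_scores, 3.0)
e = e_value(p, len(toy_scores))
```

A small E-value suggests that few random candidates would be expected to score as well as the hit. With only an empirical tail, the smallest nonzero P-value is limited to one over the number of candidates, which is one reason a theoretically characterized tail is useful.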