Proteomic biomarker discovery has identified numerous potential candidates for disease diagnosis, prognosis, and prediction of response to therapy. However, very few of these candidate biomarkers reach clinical validation and go on to be used routinely in clinical practice. One particular issue is that proteins found to change significantly in the initial discovery experiment often fail to validate when subsequently tested on separate patient sample cohorts. Here, we highlight some of the statistical challenges surrounding the analysis of LC‐MS proteomic data for biomarker candidate discovery. We show that common statistical algorithms run on data with small sample sizes can overfit and yield misleading misclassification rates and AUC values. A common response to this problem is to prefilter variables (e.g., via ANOVA and/or correction methods such as Bonferroni or false discovery rate) to give a smaller dataset and reduce the apparent size of the statistical challenge. However, we show that this exacerbates the problem, yielding even higher apparent performance metrics while reducing the predictive accuracy of the biomarker panel. To illustrate some of these limitations, we ran simulation analyses with known biomarkers. For our chosen algorithm (random forests), we show that the above problems are substantially reduced if a sufficient number of samples is analyzed and the data are not prefiltered. Our view is that LC‐MS proteomic biomarker discovery data should be analyzed without prefiltering and that increasing the sample size in biomarker discovery experiments should be a very high priority.
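The effect of prefiltering on small datasets can be illustrated with a minimal sketch. It is not the authors' simulation: it swaps in a Welch t-statistic filter and a nearest-centroid classifier in place of ANOVA and random forests, and the sample size, feature count, and all function names are hypothetical choices made for illustration. The data are pure noise, so any accuracy well above chance is an artifact. The sketch compares leave-one-out accuracy when the filter sees all samples before cross-validation with the case where the filter is rerun inside each fold, which is one well-known way (assumed here, not confirmed by the abstract) in which prefiltering can inflate performance metrics.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def simulate_noise(n_per_class, n_features, rng):
    """Pure-noise data: no feature truly differs between the two classes."""
    labels = [0] * n_per_class + [1] * n_per_class
    data = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in labels]
    return data, labels

def top_features(data, labels, k):
    """Indices of the k features with the largest absolute Welch t-statistic."""
    scores = []
    for j in range(len(data[0])):
        a = [row[j] for row, y in zip(data, labels) if y == 0]
        b = [row[j] for row, y in zip(data, labels) if y == 1]
        se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
        scores.append(abs(mean(a) - mean(b)) / se)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

def predict_nearest_centroid(train, train_labels, sample, features):
    """Assign the sample to the class whose mean profile is closest."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        rows = [r for r, y in zip(train, train_labels) if y == label]
        dist = sum((sample[j] - mean([r[j] for r in rows])) ** 2 for j in features)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loo_accuracy(data, labels, k, filter_inside_cv):
    """Leave-one-out accuracy with the filter run outside or inside the CV loop."""
    global_features = top_features(data, labels, k)  # filter has seen every sample
    correct = 0
    for i in range(len(data)):
        train = data[:i] + data[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        if filter_inside_cv:
            features = top_features(train, train_labels, k)  # held-out sample unseen
        else:
            features = global_features
        if predict_nearest_centroid(train, train_labels, data[i], features) == labels[i]:
            correct += 1
    return correct / len(data)

# Hypothetical settings: 10 samples per class, 300 noise features, keep the top 10.
rng = random.Random(0)
biased, honest = [], []
for _ in range(5):
    data, labels = simulate_noise(n_per_class=10, n_features=300, rng=rng)
    biased.append(loo_accuracy(data, labels, k=10, filter_inside_cv=False))
    honest.append(loo_accuracy(data, labels, k=10, filter_inside_cv=True))
mean_biased, mean_honest = mean(biased), mean(honest)
print(f"filter outside CV: {mean_biased:.2f}  filter inside CV: {mean_honest:.2f}")
```

Because no feature carries real signal, the gap between the two printed accuracies reflects only the selection step. Increasing `n_per_class` in the sketch is a simple way to explore how the gap changes with sample size.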
Proteomics – Wiley
Published: Jul 1, 2014