Biomarker discovery in MALDI-TOF serum protein profiles using discrete wavelet transformation

7 pages

Publisher
Oxford University Press
Copyright
© 2009 The Author(s)
ISSN
1367-4803
eISSN
1460-2059
DOI
10.1093/bioinformatics/btn662
pmid
19244390

Abstract

Motivation: Automatic classification of high-resolution mass spectrometry proteomic data has increasing potential in the early diagnosis of cancer. We propose a new procedure of biomarker discovery in serum protein profiles based on: (i) discrete wavelet transformation of the spectra; (ii) selection of discriminative wavelet coefficients by a statistical test and (iii) building and evaluating a support vector machine classifier by double cross-validation with attention to the generalizability of the results. In addition to the evaluation results (total recognition rate, sensitivity and specificity), the procedure provides the biomarker patterns, i.e. the parts of spectra which discriminate cancer and control individuals. The evaluation was performed on matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) serum protein profiles of 66 colorectal cancer patients and 50 controls.

Results: Our procedure provided a high recognition rate (97.3%), sensitivity (98.4%) and specificity (95.8%). The extracted biomarker patterns mostly represent the peaks expressing mean differences between the cancer and control spectra. However, we showed that the discriminative power of a peak is not simply expressed by its mean height and cannot be derived by comparison of the mean spectra. The obtained classifiers have high generalization power as measured by the number of support vectors. This prevents overfitting and contributes to the reproducibility of the results, which is required to find biomarkers differentiating cancer patients from healthy individuals.

Availability: The data and scripts used in this study are available at http://www.math.uni-bremen.de/~theodore/MALDIDWT.

Contact: theodore@math.uni-bremen.de

Supplementary information: Supplementary data are available at Bioinformatics online.
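The three-step procedure named in the abstract (wavelet transformation, coefficient selection by a statistical test, classification evaluated by double cross-validation) can be illustrated with a minimal, standard-library-only sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Haar wavelet, the Welch t-statistic as the "statistical test", a nearest-centroid classifier standing in for the support vector machine, the synthetic 16-point "spectra", and all function names (`haar_dwt`, `welch_t`, `select`, `fit_predict`, `splits`, `cv_accuracy`, `double_cv`).

```python
import math
import random
import statistics

def haar_dwt(signal):
    """Full Haar decomposition; len(signal) must be a power of two.
    Returns [final approximation, coarsest details, ..., finest details]."""
    approx, details = list(signal), []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details = [(a - b) / math.sqrt(2) for a, b in pairs] + details
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx + details

def welch_t(a, b):
    """Absolute Welch t-statistic between two samples."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return abs(statistics.mean(a) - statistics.mean(b)) / se if se > 0 else 0.0

def select(X, y, k):
    """Indices of the k coefficients that best separate the two classes."""
    def score(j):
        return welch_t([x[j] for x, t in zip(X, y) if t == 1],
                       [x[j] for x, t in zip(X, y) if t == 0])
    return sorted(range(len(X[0])), key=score, reverse=True)[:k]

def fit_predict(Xtr, ytr, Xte, k):
    """Select coefficients on the training data only, then classify test
    rows by nearest class centroid (a simple stand-in for an SVM)."""
    idx = select(Xtr, ytr, k)
    cent = {c: [statistics.mean(x[j] for x, t in zip(Xtr, ytr) if t == c) for j in idx]
            for c in (0, 1)}
    def nearest(x):
        return min(cent, key=lambda c: sum((x[j] - m) ** 2 for j, m in zip(idx, cent[c])))
    return [nearest(x) for x in Xte]

def splits(n, n_folds):
    """Yield (train, test) index lists using interleaved folds."""
    for f in range(n_folds):
        yield [i for i in range(n) if i % n_folds != f], list(range(f, n, n_folds))

def cv_accuracy(X, y, k, n_folds):
    """Inner-loop accuracy for one candidate number of coefficients k."""
    hits = 0
    for tr, te in splits(len(X), n_folds):
        preds = fit_predict([X[i] for i in tr], [y[i] for i in tr], [X[i] for i in te], k)
        hits += sum(p == y[i] for p, i in zip(preds, te))
    return hits / len(X)

def double_cv(X, y, k_grid=(1, 2, 4), outer=5, inner=3):
    """Outer loop estimates performance; inner loop tunes k on training data only."""
    tp = tn = fp = fn = 0
    for tr, te in splits(len(X), outer):
        Xtr, ytr = [X[i] for i in tr], [y[i] for i in tr]
        best_k = max(k_grid, key=lambda k: cv_accuracy(Xtr, ytr, k, inner))
        preds = fit_predict(Xtr, ytr, [X[i] for i in te], best_k)
        for p, i in zip(preds, te):
            if y[i] == 1:
                tp, fn = tp + (p == 1), fn + (p == 0)
            else:
                tn, fp = tn + (p == 0), fp + (p == 1)
    return {"recognition_rate": (tp + tn) / len(X),
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp)}

def synthetic_spectrum(rng, has_peak):
    """Hypothetical toy spectrum: Gaussian noise plus an optional peak."""
    s = [rng.gauss(0.0, 0.5) for _ in range(16)]
    if has_peak:
        s[4] += 5.0
        s[5] += 5.0
    return s

rng = random.Random(0)
y = [1] * 20 + [0] * 16  # 1 = "cancer", 0 = "control" (synthetic labels)
X = [haar_dwt(synthetic_spectrum(rng, label == 1)) for label in y]
result = double_cv(X, y)
print(result)
```

The point of the nesting is that both the coefficient selection and the choice of `k` see only the training part of each outer fold, so the reported recognition rate, sensitivity and specificity are not inflated by information from the held-out spectra. The indices returned by `select` play the role of the biomarker pattern, since each wavelet coefficient corresponds to a localized part of the spectrum.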

Journal

Bioinformatics, Oxford University Press

Published: Jan 6, 2009
