doi: 10.1002/jcc.20929
pmid: 18432623
A support vector machine (SVM) algorithm is proposed for predicting β‐ and γ‐turns in proteins. Sequence information is encoded as a composite vector that combines the increment of diversity, a position conservation scoring function, and predicted secondary structures. The 426 and 320 nonhomologous protein chains described by Guruprasad and Rajkumar (Guruprasad and Rajkumar J. Biosci 2000, 25, 143) are used for training and testing the predictive models of the β‐ and γ‐turns, respectively. For the β‐turns, the overall prediction accuracy and the Matthews correlation coefficient in 7‐fold cross‐validation are 79.8% and 0.47, respectively. For the γ‐turns, the overall prediction accuracy in 5‐fold cross‐validation is 61.0%. These results are significantly higher than those of other algorithms for predicting β‐ and γ‐turns on the same datasets. In addition, the 547 and 823 nonhomologous protein chains described by Fuchs and Alix (Fuchs and Alix Proteins: Struct Funct Bioinform 2005, 59, 828) are used for training and testing the predictive models of the β‐ and γ‐turns, and better results are obtained. This algorithm may help improve the performance of protein turn prediction. To assess the ability of the SVM method to correctly classify β‐turns versus non‐β‐turns (and γ‐turns versus non‐γ‐turns), receiver operating characteristic (ROC) curves, a threshold‐independent measure, are provided. © 2008 Wiley Periodicals, Inc. J Comput Chem 2008
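The abstract reports overall accuracy, the Matthews correlation coefficient, and ROC curves without stating how they are computed. As an illustration only, the following is a minimal sketch using the standard textbook definitions of these measures; it is assumed, not confirmed by the abstract, that the paper uses the same formulas. The function names and the toy labels and scores are hypothetical and are not data from the paper.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = turn, 0 = non-turn)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Fraction of residues classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def roc_points(y_true, scores):
    """Sweep the decision threshold over all scores; return (FPR, TPR) pairs."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical toy data: true labels and classifier decision scores.
toy_labels = [1, 1, 0, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
toy_pred = [1 if s >= 0.5 else 0 for s in toy_scores]

counts = confusion_counts(toy_labels, toy_pred)
print("accuracy:", accuracy(*counts))
print("MCC:", mcc(*counts))
print("AUC:", auc(roc_points(toy_labels, toy_scores)))
```

Accuracy and the Matthews correlation coefficient depend on the chosen decision threshold (0.5 in this sketch), whereas the ROC curve sweeps every threshold, which is why it serves as a threshold‐independent measure.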