DUET: a server for predicting effects of mutations on protein stability using an integrated computational approach

Douglas E.V. Pires; David B. Ascher; Tom L. Blundell

doi:10.1093/nar/gku411

Pires, Douglas E.V.; Ascher, David B.; Blundell, Tom L. (2014-07-01) Nucleic Acids Research, 2014, Vol. 42, Web Server issue, W314–W319. Published online 14 May 2014. doi: 10.1093/nar/gku411

Douglas E.V. Pires (1,*), David B. Ascher (1,2) and Tom L. Blundell (1,*)

1 Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK
2 ACRF Rational Drug Discovery Centre and Biota Structural Biology Laboratory, St Vincent's Institute of Medical Research, Fitzroy, VIC 3065, Australia

Received March 1, 2014; Revised April 28, 2014; Accepted April 29, 2014

* To whom correspondence should be addressed. Tel: +44 1223 766 033; Fax: +44 1223 766 002; Email: [email protected]
Correspondence may also be addressed to Tom L. Blundell. Tel: +44 1223 333628; Fax: +44 1223 766 002; Email: [email protected]

© The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

Cancer genome and other sequencing initiatives are generating extensive data on non-synonymous single nucleotide polymorphisms (nsSNPs) in human and other genomes. In order to understand the impacts of nsSNPs on the structure and function of the proteome, as well as to guide protein engineering, accurate in silico methodologies are required to study and predict their effects on protein stability. Despite the diversity of available computational methods in the literature, none has proven accurate and dependable on its own under all scenarios where mutation analysis is required. Here we present DUET, a web server for an integrated computational approach to study missense mutations in proteins. DUET consolidates two complementary approaches (mCSM and SDM) in a consensus prediction, obtained by combining the results of the separate methods in an optimized predictor using Support Vector Machines (SVM). We demonstrate that the proposed method improves overall accuracy of the predictions in comparison with either method individually and performs as well as or better than similar methods. The DUET web server is freely and openly available at http://structure.bioc.cam.ac.uk/duet.

INTRODUCTION

In this era of high-throughput data generation, the ability to predict accurately the impacts of non-synonymous single nucleotide polymorphisms (nsSNPs) on protein stability is an essential tool for understanding the effects of human genome variation (1), particularly with respect to personalized medicine and the mechanisms of variable drug response in humans (2). The enormous amount of data being generated from cancer genome and other sequencing initiatives (3,4) requires an accurate and scalable computational approach to understanding structural effects of mutations and correlating them with disease on the scale of the whole proteome (5). Such a computational approach should also be useful in the development of engineered proteins with improved, modified or optimized functions (6).

Over the past fifteen years, several different in silico methods for predicting the influence of mutations on protein stability have been proposed based on various evolutionary and physical chemical hypotheses (7–15), but none has proven on its own to be accurate in all situations where mutational analysis is required. For this reason, one may expect to obtain a more accurate prediction by combining methods that are based on different paradigms and that exploit different protein structural properties (16), in order to reach a consensus on the understanding of mutation effects by an integrated computational approach. As highlighted in (15), the methods mCSM and SDM (7,14) are complementary since they measure different properties and are built upon different perspectives; a combined predictor should therefore improve overall performance.

Here, we present DUET, an integrated computational approach for predicting effects of missense mutations on protein stability. DUET combines mCSM and SDM in a consensus prediction, by consolidating the results of the separate methods in an optimized predictor using Support Vector Machines (SVMs) trained with Sequential Minimal Optimization (17).

DUET was trained on a low-redundancy data set of mutations with available experimental thermodynamic data derived from the ProTherm database (18) and validated with blind test sets, achieving a Pearson's correlation coefficient of up to 0.74 during training and 0.71 in the test set (0.82 and 0.79 after 10% outlier removal, respectively). We demonstrate that DUET improves overall accuracy of the predictions in comparison with either method on its own. We also show that DUET, by selectively combining two methods, significantly outperforms another integrated approach that combines seven methods (16). A web server for DUET is available at http://bleoberis.bioc.cam.ac.uk/duet.

MATERIALS AND METHODS

SDM

The method SDM, introduced in (7,14), relies on amino acid propensities derived from environment-specific substitution tables for homologous protein families that feed a statistical potential energy function and encompass an evolutionary view of the constraints from the immediate residue environment. The approach compares amino acid propensities for the wild-type and mutant proteins in the folded and unfolded states in order to estimate the free energy differences between wild type and mutant. The website is at: http://www-cryst.bioc.cam.ac.uk/sdm/sdm.php.

mCSM

mCSM is a machine learning method to predict the effects of missense mutations based on structural signatures (15). The mCSM signatures were derived from the graph-based concept of Cutoff Scanning Matrix (CSM) (19), originally proposed to represent network topology by distance patterns in the study of biological systems. mCSM uses a graph representation of the wild-type residue environment to extract geometric and physicochemical patterns (the latter represented in terms of pharmacophores) that are then used to represent the 3D chemical environment during supervised learning. These signatures have been successfully applied in a range of tasks including protein structural classification and function prediction (20), as well as large-scale receptor-based protein ligand prediction (21). The mCSM website is available at: http://structure.bioc.cam.ac.uk/mcsm.

DUET-Integrated Computational Approach

Figure 1 shows the workflow of the developed methodology. Given a single point mutation in a protein structure, DUET calculates a combined/consensus prediction by combining the predictions from two methods (mCSM and SDM) in a non-linear way, using SVM regression with a Radial Basis Function kernel (22).

In order to do so, complementary information regarding the mutation, such as secondary structure (used by SDM) and a pharmacophore vector that accounts for the changes between wild-type and mutant residue (used by mCSM), is also calculated and used by DUET. As described previously (15), the pharmacophore vector is obtained by comparing the frequency of eight possible atom characteristics between wild-type and mutant residues (hydrophobic, positive, negative, hydrogen acceptor, hydrogen donor, aromatic, sulphur and neutral).

As a filtering step, residue relative solvent accessibility (RSA) is used to optimize the standard SDM predictions using a regression model tree before combining it with mCSM. The M5P algorithm (23) was used to generate the regression tree, which improved the SDM performance on the blind test from r = 0.56 to r = 0.62.

Finally, the mCSM and optimized SDM predictions, together with secondary structure from SDM and the pharmacophore vector from mCSM, are fed to the SVM algorithm, generating a combined output from a supervised learning scheme. The experimental thermodynamic data for each mutation in training and test sets are used to evaluate the accuracy of the combined method.

Figure 1. DUET workflow for obtaining a consensus prediction for a single point mutation. The grey and the blue boxes denote the server's input and output, respectively. Green boxes denote intermediate prediction values used by DUET and yellow boxes denote complementary information used to optimize SDM prediction or by DUET.

WEB SERVER

Input

In order to run a prediction on the DUET server, the user submits a PDB structure or 4-letter code of the wild-type protein of interest, as well as the mutation information (residue position, wild-type and mutant residue codes in one-letter format) and chain identifier. Users also have the option to perform systematic mutations of a particular residue to all 19 possible mutants. DUET supports nuclear magnetic resonance structures but only the first model will be taken into account. Users are encouraged to submit PDB files with a single chain, with the exception of cases of proteins that fold upon binding (coupled folding and binding of intrinsically disordered proteins (24)). A help page to assist users on how to run and interpret the results of the predictions is available on the top navigation bar.

Output

As shown in Figure 2, the server displays in the output page the predictions from the individual methods, the combined/consensus prediction obtained by DUET and an interactive visualization of the uploaded PDB file via GLMol. This interface allows the user to visualize the protein with molecules represented in several ways, such as 'cartoon', 'ball and stick' and 'spheres', as well as to take snapshots. The predicted results are expressed as the variation in Gibbs Free Energy (ΔΔG) and negative values denote destabilizing mutations. Complementary information such as residue relative solvent accessibility (RSA, calculated using the Richards method (25)), side-chain hydrogen bond satisfaction and secondary structure (programme SSTRUC) are calculated and shown. The user also has the option of downloading the structure of the mutant protein generated by the programme ANDANTE (26), as required for the method SDM.

Figure 2. Result page for DUET prediction. The results display the predicted change in folding free energy upon mutation (ΔΔG in kcal/mol). A negative value (and red writing) corresponds to a mutation predicted as destabilizing, while a positive sign (and blue writing) corresponds to a mutation predicted as stabilizing. The information displayed includes the mCSM (i) and SDM (ii) individually predicted protein stability changes, the combined DUET prediction (iii), a structural summary of the mutation highlighting the wild-type residue and position number, the mutation and its 3D environment (iv). The protein and mutation can also be visualized (v), or a PDB file of the mutant downloaded for viewing in your preferred molecular visualization software.

VALIDATION

Mutation Data sets

DUET's regression model was trained on data for mutations derived from the ProTherm database (18) and used in a previous study (15). The training set is formed by 2297 randomly selected mutations drawn from the S2648 data set used by the PoPMuSiC method (13). To minimize the risk of overfitting, two blind test sets were devised to validate the method. The first data set was composed of 351 non-redundant mutations at position level, meaning that mutations in a given position are either in the training or test set exclusively. More information about the data sets used can be found in Section 1 in Supplementary Material.

In order to perform a comparative test between DUET and iStable (16), we used a data set of mutations on the p53 protein, a transcription factor whose loss of function is correlated with tumourigenesis, which was assembled in a previous study (15). This data set contained 42 mutations within the DNA binding domain of the tumour suppressor p53 protein with experimentally characterized thermodynamic effects available in the scientific literature. None of these mutations was present in the training set.

RESULTS

Figure 3 shows regression analysis for the stability predictions generated by DUET in comparison with the experimentally measured variation in stability for the considered data sets. During training, DUET achieved a Pearson's correlation coefficient of r = 0.74 with a standard error of σ = 0.98 kcal/mol, significantly better than mCSM (r = 0.69, σ = 1.06 kcal/mol; see Section 2 in Supplementary Material). Furthermore, a correlation of r = 0.82 with standard error of σ = 0.72 kcal/mol is obtained after 10% outlier removal. In the first blind test set of 351 non-redundant mutations, DUET achieved a correlation of r = 0.71 (σ = 1.13 kcal/mol), which is considerably higher than the performance of either method individually (r = 0.56 and r = 0.67 for SDM and mCSM, respectively; see Section 2 in Supplementary Material). The correlation in 90% of the data set peaks at r = 0.79 (σ = 0.84 kcal/mol).

In order to compare DUET with iStable (16), a recently proposed integrated computational approach, a blind test with p53 mutations was devised. iStable is a meta-predictor that combines seven different methods using an SVM algorithm, and integrates complementary information such as residue solvent accessibility, secondary structure and sequence information.

Table 1 shows the comparative results between the computationally integrated approaches DUET and iStable, as well as mCSM and SDM. Even though iStable relies on the predictions of seven different methods, the approach achieved a correlation coefficient of only r = 0.49, which is inconsistent with the correlation of r = 0.86 that the authors report during cross-validation. In contrast, DUET achieves r = 0.68 (σ = 1.40 kcal/mol), which is consistent with the method's performance during training and blind test validation. By removing 10% of outliers (only four mutations), DUET's correlation coefficient rises to r = 0.77 and standard error drops to σ = 1.12 kcal/mol, in comparison with a correlation of r = 0.64 (σ = 1.37 kcal/mol) achieved by iStable.

Table 1. Comparative prediction performance of methods on p53 data set

Method     Pearson's coefficient (a)     Standard error, kcal/mol (a)
mCSM       0.68 / 0.72                   1.40 / 1.20
SDM        0.52 / 0.64                   1.61 / 1.32
iStable    0.49 / 0.64                   1.59 / 1.37
DUET       0.68 / 0.76                   1.39 / 1.13

(a) The two values given per column correspond respectively to the whole validation set of 42 mutants and the results after removing 10% of the outliers.

Figure 3. Regression analysis between experimental and predicted stability changes by DUET. The left graph shows the performance of DUET during training while the right graph shows the predictive performance in two different blind test sets. Pearson's correlation coefficient (r) and standard error (σ) are also shown for each data set.

CONCLUSIONS

DUET is an accurate, free and easy-to-use bioinformatics web server created for experts and non-experts alike who are interested in gaining insight into the effects of nsSNPs on protein stability. It integrates two complementary methods into a consensus/optimized prediction, as a way to leverage the best of SDM, a statistical potential energy function that relies on substitution tables derived from homologous protein families and incorporates constraints on residue environments during evolution, and mCSM, a machine learning algorithm that takes into account the residue 3D physicochemical environment summarized as a graph-based structural signature. DUET is a valuable tool for a wide variety of applications, ranging from protein stability modulation to understanding the role of mutations in diseases.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online including [1–6].

ACKNOWLEDGEMENT

The authors thank Harry Jubb and Bernardo Ochoa for helpful discussions and feedback.

FUNDING

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Brazil (to D.E.V.P.); NHMRC CJ Martin Fellowship (GNT1072476), Victoria Fellowship from the Victorian Government and the Leslie (Les) J. Fleming Churchill Fellowship from The Winston Churchill Memorial Trust (to D.B.A.); University of Cambridge and The Wellcome Trust for facilities and support (093167 to T.L.B.). Funding for open access charge: The Wellcome Trust.

Conflict of interest statement. None declared.

REFERENCES

1. Capriotti,E., Nehrt,N.L., Kann,M.G. and Bromberg,Y. (2012) Bioinformatics for personal genome interpretation. Brief Bioinform., 13, 495–512.
2. Lahti,J.L., Tang,G.W., Capriotti,E., Liu,T. and Altman,R.B. (2012) Bioinformatics and variability in drug response: a protein structural perspective. J. R. Soc. Interface, 9, 1409–1437.
3. Stratton,M.R., Campbell,P.J. and Futreal,P.A. (2009) The cancer genome. Nature, 458, 719–724.
4. Hudson,T.J., Anderson,W., Aretz,A., Barker,A.D., Bell,C., Bernabé,R.R., Bhan,M.K., Calvo,F., Eerola,I., Gerhard,D.S. and others (2010) International network of cancer genome projects. Nature, 464, 993–998.
5. Casadio,R., Vassura,M., Tiwari,S., Fariselli,P. and Luigi Martelli,P. (2011) Correlating disease-related mutations to their effect on protein stability: A large-scale analysis of the human proteome. Hum. Mutat., 32, 1161–1170.
6. Socha,R.D. and Tokuriki,N. (2013) Modulating protein stability – directed evolution strategies for improved protein function. FEBS J., 280, 5582–5595.
7. Topham,C.M., Srinivasan,N. and Blundell,T.L. (1997) Prediction of the stability of protein mutants based on structural environment-dependent amino acid substitution and propensity tables. Protein Eng., 10, 7–21.
8. Guerois,R., Nielsen,J.E. and Serrano,L. (2002) Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J. Mol. Biol., 320, 369–387.
9. Capriotti,E., Fariselli,P. and Casadio,R. (2004) A neural-network-based method for predicting protein stability changes upon single point mutations. Bioinformatics, 20, i63–i68.
10. Capriotti,E., Fariselli,P. and Casadio,R. (2005) I-Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure. Nucleic Acids Res., 33, W306–W310.
11. Parthiban,V., Gromiha,M.M. and Schomburg,D. (2006) CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res., 34, W239–W242.
12. Cheng,J., Randall,A. and Baldi,P. (2006) Prediction of protein stability changes for single-site mutations using support vector machines. Proteins, 62, 1125–1132.
13. Dehouck,Y., Grosfils,A., Folch,B., Gilis,D., Bogaerts,P. and Rooman,M. (2009) Fast and accurate predictions of protein stability changes upon mutations using statistical potentials and neural networks: PoPMuSiC-2.0. Bioinformatics, 25, 2537–2543.
14. Worth,C.L., Preissner,R. and Blundell,T.L. (2011) SDM - a server for predicting effects of mutations on protein stability and malfunction. Nucleic Acids Res., 39, W215–W222.
15. Pires,D.E.V., Ascher,D.B. and Blundell,T.L. (2014) mCSM: predicting the effects of mutations in proteins using graph-based signatures. Bioinformatics, 30, 335–342.
16. Chen,C., Lin,J. and Chu,Y. (2013) iStable: off-the-shelf predictor integration for predicting protein stability changes. BMC Bioinformatics, 14, S5.
17. Shevade,S.K., Keerthi,S.S., Bhattacharyya,C. and Murthy,K.R.K. (2000) Improvements to the SMO algorithm for SVM regression. IEEE Trans. Neural Netw., 11, 1188–1193.
18. Kumar,M.D.S., Bava,K.A., Gromiha,M.M., Prabakaran,P., Kitajima,K., Uedaira,H. and Sarai,A. (2006) ProTherm and ProNIT: thermodynamic databases for proteins and protein–nucleic acid interactions. Nucleic Acids Res., 34, D204–D206.
19. da Silveira,C.H., Pires,D.E.V., Melo-Minardi,R.C., Ribeiro,C., Veloso,C.J.M., Lopes,J.C.D., Meira,W. Jr, Neshich,G., Ramos,C.H.I., Habesch,R. and Santoro,M.M. (2009) Protein cutoff scanning: A comparative analysis of cutoff dependent and cutoff free methods for prospecting contacts in proteins. Proteins, 74, 727–743.
20. Pires,D.E.V., Melo-Minardi,R.C., Santos,M.A., da Silveira,C.H., Santoro,M.M. and Meira,W. Jr (2011) Cutoff Scanning Matrix (CSM): structural classification and function prediction by protein inter-residue distance patterns. BMC Genomics, 12, S12.
21. Pires,D.E.V., de Melo-Minardi,R.C., da Silveira,C.H., Campos,F.F. and Meira,W. Jr (2013) aCSM: noise-free graph-based signatures to large-scale receptor-based ligand prediction. Bioinformatics, 29, 855–861.
22. Schölkopf,B., Sung,K., Burges,C.J.C., Girosi,F., Niyogi,P., Poggio,T. and Vapnik,V. (1997) Comparing support vector machines with Gaussian kernels to radial basis function classifiers. IEEE Trans. Signal Process., 45, 2758–2765.
23. Quinlan,J.R. (1992) Learning with continuous classes. In: Proceedings of the 5th Australian joint Conference on Artificial Intelligence, Vol. 92, pp. 343–348.
24. Sugase,K., Dyson,H.J. and Wright,P.E. (2007) Mechanism of coupled folding and binding of an intrinsically disordered protein. Nature, 447, 1021–1025.
25. Lee,B. and Richards,F.M. (1971) The interpretation of protein structures: estimation of static accessibility. J. Mol. Biol., 55, 379–400.
26. Smith,R.E., Lovell,S.C., Burke,D.F., Montalvao,R.W. and Blundell,T.L. (2007) Andante: reducing side-chain rotamer search space during comparative modeling using environment-specific substitution probabilities. Bioinformatics, 23, 1099–1105.
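The pharmacophore vector described under "DUET-Integrated Computational Approach" compares the frequency of eight atom characteristics between the wild-type and mutant residues. The following is a minimal sketch of that idea only. The per-residue counts in `ATOM_CLASS_COUNTS` are illustrative placeholders rather than the table used by mCSM, the function name is hypothetical, and the sign convention (mutant minus wild type) is an assumption.

```python
# The eight atom characteristics named in the text, in a fixed order.
CLASSES = (
    "hydrophobic", "positive", "negative", "acceptor",
    "donor", "aromatic", "sulphur", "neutral",
)

# Hypothetical per-residue counts of atoms in each class.
# Illustrative values only; NOT the table used by mCSM.
ATOM_CLASS_COUNTS = {
    "A": {"hydrophobic": 1, "acceptor": 1, "donor": 1, "neutral": 2},
    "F": {"hydrophobic": 7, "aromatic": 6, "acceptor": 1, "donor": 1, "neutral": 2},
    "D": {"hydrophobic": 1, "negative": 2, "acceptor": 4, "donor": 1, "neutral": 2},
}


def pharmacophore_vector(wild_type, mutant, counts=ATOM_CLASS_COUNTS):
    """Return the per-class count difference (mutant minus wild type).

    The direction of the difference is an assumption of this sketch.
    """
    wt = counts[wild_type]
    mt = counts[mutant]
    return [mt.get(c, 0) - wt.get(c, 0) for c in CLASSES]


if __name__ == "__main__":
    # Alanine to phenylalanine gains hydrophobic and aromatic atoms
    # under the placeholder counts above.
    print(dict(zip(CLASSES, pharmacophore_vector("A", "F"))))
```

A vector of this kind would be one of the inputs, alongside the mCSM and optimized SDM predictions and the secondary structure, that the text says are passed to the SVM regressor.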

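The Results report Pearson's correlation coefficient both on each whole data set and after removing 10% of outliers. The sketch below shows one way such a figure could be computed. The paper does not state how outliers were chosen, so ranking by absolute prediction error is an assumption here, and the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def drop_outliers(experimental, predicted, fraction=0.10):
    """Drop the given fraction of points with the largest absolute error.

    Assumption: an "outlier" is a mutation whose prediction is furthest
    from the experimental value; the paper does not define the criterion.
    """
    n_drop = round(fraction * len(experimental))
    ranked = sorted(zip(experimental, predicted), key=lambda p: abs(p[0] - p[1]))
    kept = ranked[: len(ranked) - n_drop]
    return [e for e, _ in kept], [p for _, p in kept]


if __name__ == "__main__":
    # Toy values in kcal/mol, invented for illustration.
    exp = [-3.0, -2.5, -2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
    pred = exp[:-1] + [-3.0]  # one badly mispredicted mutation
    print("all points:", pearson(exp, pred))
    print("after 10% removal:", pearson(*drop_outliers(exp, pred)))
```

Under this rounding rule, a 42-mutation set such as the p53 data set would lose four mutations at the 10% level.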

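The Input section says a job is defined by a wild-type residue, a residue position and a mutant residue in one-letter code, and that a residue can also be mutated systematically to all 19 alternatives. The sketch below illustrates that bookkeeping. The compact `R175H`-style string is an assumed notation for the example, not the DUET submission format, and the function names are hypothetical. Insertion codes and negative residue numbers are not handled.

```python
import re

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumed compact notation: wild type, position, mutant (e.g. "R175H").
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(code):
    """Split a string such as 'R175H' into (wild_type, position, mutant)."""
    match = _MUTATION.match(code.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised mutation: {code!r}")
    wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    if wild_type not in AMINO_ACIDS or mutant not in AMINO_ACIDS:
        raise ValueError(f"non-standard residue in {code!r}")
    if wild_type == mutant:
        raise ValueError(f"wild type and mutant are identical in {code!r}")
    return wild_type, position, mutant


def systematic_mutations(wild_type, position):
    """List the 19 single-point mutants of one residue."""
    return [f"{wild_type}{position}{aa}" for aa in AMINO_ACIDS if aa != wild_type]


if __name__ == "__main__":
    print(parse_mutation("R175H"))
    print(systematic_mutations("R", 175))
```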

Publisher: Oxford University Press
Copyright: The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
ISSN: 0305-1048
eISSN: 1362-4962
DOI: 10.1093/nar/gku411
pmid: 24829462
Publisher site: See Article on Publisher Site

Abstract

Nucleic Acids Research, 2014, Vol. 42, Web Server issue, W314–W319. Published online 14 May 2014. doi: 10.1093/nar/gku411

DUET: a server for predicting effects of mutations on protein stability using an integrated computational approach

Douglas E.V. Pires (1,*), David B. Ascher (1,2) and Tom L. Blundell (1,*)

1 Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK and 2 ACRF Rational Drug Discovery Centre and Biota Structural Biology Laboratory, St Vincent's Institute of Medical Research, Fitzroy, VIC 3065, Australia

Received March 1, 2014; Revised April 28, 2014; Accepted April 29, 2014

* To whom correspondence should be addressed. Tel: +44 1223 766 033; Fax: +44 1223 766 002; Email: [email protected]
Correspondence may also be addressed to Tom L. Blundell. Tel: +44 1223 333628; Fax: +44 1223 766 002; Email: [email protected]

© The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

Cancer genome and other sequencing initiatives are generating extensive data on non-synonymous single nucleotide polymorphisms (nsSNPs) in human and other genomes. In order to understand the impacts of nsSNPs on the structure and function of the proteome, as well as to guide protein engineering, accurate in silico methodologies are required to study and predict their effects on protein stability. Despite the diversity of available computational methods in the literature, none has proven accurate and dependable on its own under all scenarios where mutation analysis is required. Here we present DUET, a web server for an integrated computational approach to study missense mutations in proteins. DUET consolidates two complementary approaches (mCSM and SDM) in a consensus prediction, obtained by combining the results of the separate methods in an optimized predictor using Support Vector Machines (SVM). We demonstrate that the proposed method improves overall accuracy of the predictions in comparison with either method individually and performs as well as or better than similar methods. The DUET web server is freely and openly available at http://structure.bioc.cam.ac.uk/duet.

INTRODUCTION

In this era of high-throughput data generation, the ability to predict accurately the impacts of non-synonymous single nucleotide polymorphisms (nsSNPs) on protein stability is an essential tool for understanding the effects of human genome variation (1), particularly with respect to personalized medicine and the mechanisms of variable drug response in humans (2). The enormous amount of data being generated from cancer genome and other sequencing initiatives (3,4) requires an accurate and scalable computational approach to understanding structural effects of mutations and correlating them with disease on the scale of the whole proteome (5). Such a computational approach should also be useful in the development of engineered proteins with improved, modified or optimized functions (6).

Over the past fifteen years, several different in silico methods for predicting the influence of mutations on protein stability have been proposed based on various evolutionary and physicochemical hypotheses (7–15), but none has proven on its own to be accurate in all situations where mutational analysis is required. For this reason, one may expect to obtain a more accurate prediction by combining methods that are based on different paradigms and that exploit different protein structural properties (16), in order to reach a consensus on the understanding of mutation effects by an integrated computational approach. As highlighted in (15), the methods mCSM and SDM (7,14) are complementary since they measure different properties and are built upon different perspectives; a combined predictor should therefore improve overall performance.

Here, we present DUET, an integrated computational approach for predicting effects of missense mutations on protein stability. DUET combines mCSM and SDM in a consensus prediction, by consolidating the results of the separate methods in an optimized predictor using Support Vector Machines (SVMs) trained with Sequential Minimal Optimization (17).

DUET was trained on a low-redundancy data set of mutations with available experimental thermodynamic data derived from the ProTherm database (18) and validated with blind test sets, achieving a Pearson's correlation coefficient of up to 0.74 during training and 0.71 in the test set (0.82 and 0.79 after 10% outlier removal, respectively). We demonstrate that DUET improves overall accuracy of the predictions in comparison with either method on its own. We also show that DUET, by selectively combining two methods, significantly outperforms another integrated approach that combines seven methods (16). A web server for DUET is available at http://structure.bioc.cam.ac.uk/duet.

MATERIALS AND METHODS

SDM

The method SDM, introduced in (7,14), relies on amino acid propensities derived from environment-specific substitution tables for homologous protein families that feed a statistical potential energy function and encompass an evolutionary view of the constraints from the immediate residue environment. The approach compares amino acid propensities for the wild-type and mutant proteins in the folded and unfolded states in order to estimate the free energy differences between wild type and mutant. The website is at: http://www-cryst.bioc.cam.ac.uk/ sdm/sdm.php.

mCSM

mCSM is a machine learning method to predict the effects of missense mutations based on structural signatures (15). The mCSM signatures were derived from the graph-based concept of Cutoff Scanning Matrix (CSM) (19), originally proposed to represent network topology by distance patterns in the study of biological systems. mCSM uses a graph representation of the wild-type residue environment to extract geometric and physicochemical patterns (the latter represented in terms of pharmacophores) that are then used to represent the 3D chemical environment during supervised learning. These signatures have been successfully applied in a range of tasks including protein structural classification and function prediction (20), as well as large-scale receptor-based protein ligand prediction (21). The mCSM website is available at: http://structure.bioc.cam.ac.uk/mcsm.

DUET-Integrated Computational Approach

Figure 1 shows the workflow of the developed methodology. Given a single point mutation in a protein structure, DUET calculates a combined/consensus prediction by combining the predictions from two methods (mCSM and SDM) in a non-linear way, using SVM regression with a Radial Basis Function kernel (22).

In order to do so, complementary information regarding the mutation, such as secondary structure (used by SDM) and a pharmacophore vector that accounts for the changes between wild-type and mutant residue (used by mCSM), is also calculated and used by DUET. As described previously (15), the pharmacophore vector is obtained by comparing the frequency of eight possible atom characteristics between wild-type and mutant residues (hydrophobic, positive, negative, hydrogen acceptor, hydrogen donor, aromatic, sulphur and neutral).

As a filtering step, residue relative solvent accessibility (RSA) is used to optimize the standard SDM predictions using a regression model tree before combining them with mCSM. The M5P algorithm (23) was used to generate the regression tree, which improved the SDM performance on the blind test from r = 0.56 to r = 0.62.

Finally, the mCSM and optimized SDM predictions, together with secondary structure from SDM and the pharmacophore vector from mCSM, are fed to the SVM algorithm, generating a combined output from a supervised learning scheme. The experimental thermodynamic data for each mutation in training and test sets are used to evaluate the accuracy of the combined method.

WEB SERVER

Input

In order to run a prediction on the DUET server, the user submits a PDB structure or 4-letter code of the wild-type protein of interest, as well as the mutation information (residue position, wild-type and mutant residue codes in one-letter format) and chain identifier. Users also have the option to perform systematic mutations of a particular residue to all 19 possible mutants. DUET supports nuclear magnetic resonance structures but only the first model will be taken into account. Users are encouraged to submit PDB files with a single chain, with the exception of cases of proteins that fold upon binding (coupled folding and binding of intrinsically disordered proteins (24)). A help page to assist users on how to run and interpret the results of the predictions is available on the top navigation bar.

Output

As shown in Figure 2, the server displays in the output page the predictions from the individual methods, the combined/consensus prediction obtained by DUET and an interactive visualization of the uploaded PDB file via GLMol. This interface allows the user to visualize the protein with molecules represented in several ways, such as 'cartoon', 'ball and stick' and 'spheres', as well as to take snapshots. The predicted results are expressed as the variation in Gibbs Free Energy (ΔΔG), and negative values denote destabilizing mutations. Complementary information such as residue relative solvent accessibility (RSA, calculated using the Richards method (25)), side-chain hydrogen bond satisfaction and secondary structure (programme SSTRUC) is calculated and shown. The user also has the option of downloading the structure of the mutant protein generated by the programme ANDANTE (26), as required for the method SDM.

VALIDATION

Mutation Data sets

DUET's regression model was trained on data for mutations derived from the ProTherm database (18) and used in a previous study (15). The training set is formed by 2297 randomly selected mutations drawn from the S2648 data set used by the PoPMuSiC method (13). To minimize the risk of overfitting, two blind test sets were devised to validate the method. The first data set was composed of 351 non-redundant mutations at position level, meaning that mutations in a given position are either in the training or test set exclusively. More information about the data sets used can be found in Section 1 in Supplementary Material.

Figure 1. DUET workflow for obtaining a consensus prediction for a single point mutation. The grey and the blue boxes denote the server's input and output, respectively. Green boxes denote intermediate prediction values used by DUET and yellow boxes denote complementary information used to optimize SDM prediction or by DUET.
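The position-level non-redundancy described for the first blind test set can be illustrated with a minimal sketch. The paper does not say how the split was actually produced, so everything here is an assumption made for illustration: the tuple layout, the function name `split_by_position`, and the choice to apply the test fraction to positions instead of to individual mutations are all hypothetical.

```python
import random
from collections import defaultdict


def split_by_position(mutations, test_fraction, seed=0):
    """Split mutations so that no residue position appears in both sets.

    `mutations` is a list of hypothetical tuples
    (protein_id, chain, position, wild_type, mutant).
    `test_fraction` is applied to the number of distinct positions,
    not to the number of mutations.
    """
    # Group every mutation under its (protein, chain, position) site.
    by_site = defaultdict(list)
    for mutation in mutations:
        by_site[mutation[:3]].append(mutation)

    # Shuffle the sites reproducibly, then assign whole sites to one side.
    sites = sorted(by_site)
    random.Random(seed).shuffle(sites)
    n_test = round(len(sites) * test_fraction)
    test_sites = set(sites[:n_test])

    train, test = [], []
    for site in sites:
        target = test if site in test_sites else train
        target.extend(by_site[site])
    return train, test


if __name__ == "__main__":
    # Illustrative records only; these are not taken from ProTherm.
    example = [
        ("1ABC", "A", 10, "A", "G"),
        ("1ABC", "A", 10, "A", "V"),
        ("1ABC", "A", 25, "L", "P"),
        ("2XYZ", "B", 40, "D", "N"),
    ]
    train_set, test_set = split_by_position(example, test_fraction=0.34)
    shared = {m[:3] for m in train_set} & {m[:3] for m in test_set}
    print(len(train_set), len(test_set), "shared positions:", len(shared))
```

Because whole positions are assigned to one side, the resulting set sizes depend on how many mutations each position carries; a fixed-size test set such as the 351 mutations reported above would need an extra selection step that this sketch does not attempt.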
In order to perform a comparative test between DUET and iStable (16), we used a data set of mutations on the p53 protein, a transcription factor whose loss of function is correlated with tumourigenesis, which was assembled in a previous study (15). This data set contained 42 mutations within the DNA binding domain of the tumour suppressor p53 protein with experimentally characterized thermodynamic effects available in the scientific literature. None of these mutations was present in the training set.

RESULTS

Figure 3 shows regression analysis for the stability predictions generated by DUET in comparison with the experimentally measured variation in stability for the considered data sets. During training, DUET achieved a Pearson's correlation coefficient of r = 0.74 with a standard error of σ = 0.98 kcal/mol, significantly better than mCSM (r = 0.69, σ = 1.06 kcal/mol; see Section 2 in Supplementary Material). Furthermore, a correlation of r = 0.82 with standard error of σ = 0.72 kcal/mol is obtained after 10% outlier removal. In the first blind test set of 351 non-redundant mutations, DUET achieved a correlation of r = 0.71 (σ = 1.13 kcal/mol), which is considerably higher than the performance of either method individually (r = 0.56 and r = 0.67 for SDM and mCSM, respectively; see Section 2 in Supplementary Material). The correlation in 90% of the data set reaches r = 0.79 (σ = 0.84 kcal/mol).

Figure 2. Result page for DUET prediction. The results display the predicted change in folding free energy upon mutation (ΔΔG in kcal/mol). A negative value (and red writing) corresponds to a mutation predicted as destabilizing, while a positive sign (and blue writing) corresponds to a mutation predicted as stabilizing. The information displayed includes the mCSM (i) and SDM (ii) individually predicted protein stability changes, the combined DUET prediction (iii), and a structural summary of the mutation highlighting the wild-type residue and position number, the mutation and its 3D environment (iv). The protein and mutation can also be visualized (v), or a PDB file of the mutant downloaded for viewing in your preferred molecular visualization software.

Figure 3. Regression analysis between experimental and predicted stability changes by DUET. The left graph shows the performance of DUET during training while the right graph shows the predictive performance in two different blind test sets. Pearson's correlation coefficient (r) and standard error (σ) are also shown for each data set.

In order to compare DUET with iStable (16), a recently proposed integrated computational approach, a blind test with p53 mutations was devised. iStable is a meta-predictor that combines seven different methods using an SVM algorithm, and integrates complementary information such as residue solvent accessibility, secondary structure and sequence information.

Table 1 shows the comparative results between the computationally integrated approaches DUET and iStable, as well as mCSM and SDM. Even though iStable relies on the predictions of seven different methods, the approach achieved a correlation coefficient of only r = 0.49, which is inconsistent with the correlation of r = 0.86 that the authors report during cross-validation. In contrast, DUET achieves r = 0.68 (σ = 1.40 kcal/mol), which is consistent with the method's performance during training and blind test validation. By removing 10% of outliers (only four mutations), DUET's correlation coefficient rises to r = 0.76 and standard error drops to σ = 1.13 kcal/mol, in comparison with a correlation of r = 0.64 (σ = 1.37 kcal/mol) achieved by iStable.

Table 1. Comparative prediction performance of methods on the p53 data set

Method     Pearson's coefficient (a)     Standard error, kcal/mol (a)
mCSM       0.68 / 0.72                   1.40 / 1.20
SDM        0.52 / 0.64                   1.61 / 1.32
iStable    0.49 / 0.64                   1.59 / 1.37
DUET       0.68 / 0.76                   1.39 / 1.13

(a) The two values given per column correspond respectively to the whole validation set of 42 mutants and the results after removing 10% of the outliers.

CONCLUSIONS

DUET is an accurate, free and easy-to-use bioinformatics web server created for experts and non-experts alike who are interested in gaining insight into the effects of nsSNPs on protein stability. It integrates two complementary methods into a consensus/optimized prediction, as a way to leverage the best of both: SDM, a statistical potential energy function that relies on substitution tables derived from homologous protein families and so incorporates constraints on residue environments during evolution, and mCSM, a machine learning algorithm that takes into account the residue 3D physicochemical environment summarized as a graph-based structural signature. DUET is a valuable tool for a wide variety of applications, ranging from protein stability modulation to understanding the role of mutations in diseases.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online including [1–6].

ACKNOWLEDGEMENT

The authors thank Harry Jubb and Bernardo Ochoa for helpful discussions and feedback.

FUNDING

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Brazil (to D.E.V.P.); NHMRC CJ Martin Fellowship (GNT1072476), Victoria Fellowship from the Victorian Government and the Leslie (Les) J. Fleming Churchill Fellowship from The Winston Churchill Memorial Trust (to D.B.A.); University of Cambridge and The Wellcome Trust for facilities and support (093167 to T.L.B.). Funding for open access charge: The Wellcome Trust.

Conflict of interest statement. None declared.

REFERENCES

1. Capriotti,E., Nehrt,N.L., Kann,M.G. and Bromberg,Y. (2012) Bioinformatics for personal genome interpretation. Brief Bioinform., 13, 495–512.
2. Lahti,J.L., Tang,G.W., Capriotti,E., Liu,T. and Altman,R.B. (2012) Bioinformatics and variability in drug response: a protein structural perspective. J. R. Soc. Interface, 9, 1409–1437.
3. Stratton,M.R., Campbell,P.J. and Futreal,P.A. (2009) The cancer genome. Nature, 458, 719–724.
4. Hudson,T.J., Anderson,W., Aretz,A., Barker,A.D., Bell,C., Bernabé,R.R.M.K., Calvo,F., Eerola,I., Gerhard,D.S. and others (2010) International network of cancer genome projects. Nature, 464, 993–998.
5. Casadio,R., Vassura,M., Tiwari,S., Fariselli,P. and Luigi Martelli,P. (2011) Correlating disease-related mutations to their effect on protein stability: A large-scale analysis of the human proteome. Hum. Mutat., 32, 1161–1170.
6. Socha,R.D. and Tokuriki,N. (2013) Modulating protein stability–directed evolution strategies for improved protein function. FEBS J., 280, 5582–5595.
7. Topham,C.M., Srinivasan,N. and Blundell,T.L. (1997) Prediction of the stability of protein mutants based on structural environment-dependent amino acid substitution and propensity tables. Protein Eng., 10, 7–21.
8. Guerois,R., Nielsen,J.E. and Serrano,L. (2002) Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J. Mol. Biol., 320, 369–387.
9. Capriotti,E., Fariselli,P. and Casadio,R. (2004) A neural-network-based method for predicting protein stability changes upon single point mutations. Bioinformatics, 20, i63–i68.
10. Capriotti,E., Fariselli,P. and Casadio,R. (2005) I-Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure. Nucleic Acids Res., 33, W306–W310.
11. Parthiban,V., Gromiha,M.M. and Schomburg,D. (2006) CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res., 34, W239–W242.
12. Cheng,J., Randall,A. and Baldi,P. (2006) Prediction of protein stability changes for single-site mutations using support vector machines. Proteins, 62, 1125–1132.
13. Dehouck,Y., Grosfils,A., Folch,B., Gilis,D., Bogaerts,P. and Rooman,M. (2009) Fast and accurate predictions of protein stability changes upon mutations using statistical potentials and neural networks: PoPMuSiC-2.0. Bioinformatics, 25, 2537–2543.
14. Worth,C.L., Preissner,R. and Blundell,T.L. (2011) SDM - a server for predicting effects of mutations on protein stability and malfunction. Nucleic Acids Res., 39, W215–W222.
15. Pires,D.E.V., Ascher,D.B. and Blundell,T.L. (2014) mCSM: predicting the effects of mutations in proteins using graph-based signatures. Bioinformatics, 30, 335–342.
16. Chen,C., Lin,J. and Chu,Y. (2013) iStable: off-the-shelf predictor integration for predicting protein stability changes. BMC Bioinformatics, 14, S5.
17. Shevade,S.K., Keerthi,S.S., Bhattacharyya,C. and Murthy,K.R.K. (2000) Improvements to the SMO algorithm for SVM regression. IEEE Trans. Neural Netw., 11, 1188–1193.
18. Kumar,M.D.S., Bava,K.A., Gromiha,M.M., Prabakaran,P., Kitajima,K., Uedaira,H. and Sarai,A. (2006) ProTherm and ProNIT: thermodynamic databases for proteins and protein–nucleic acid interactions. Nucleic Acids Res., 34, D204–D206.
19. da Silveira,C.H., Pires,D.E.V., Melo-Minardi,R.C., Ribeiro,C., Veloso,C.J.M., Lopes,J.C.D., Meira,W. Jr, Neshich,G., Ramos,C.H.I., Habesch,R. and Santoro,M.M. (2009) Protein cutoff scanning: A comparative analysis of cutoff dependent and cutoff free methods for prospecting contacts in proteins. Proteins, 74, 727–743.
20. Pires,D.E.V., Melo-Minardi,R.C., Santos,M.A., da Silveira,C.H., Santoro,M.M. and Meira,W. Jr (2011) Cutoff Scanning Matrix (CSM): structural classification and function prediction by protein inter-residue distance patterns. BMC Genomics, 12, S12.
21. Pires,D.E.V., de Melo-Minardi,R.C., da Silveira,C.H., Campos,F.F. and Meira,W. Jr (2013) aCSM: noise-free graph-based signatures to large-scale receptor-based ligand prediction. Bioinformatics, 29, 855–861.
22. Scholkopf,B., Sung,K., Burges,C.J.C., Girosi,F., Niyogi,P., Poggio,T. and Vapnik,V. (1997) Comparing support vector machines with Gaussian kernels to radial basis function classifiers. IEEE Trans. Signal Process., 45, 2758–2765.
23. Quinlan,J.R. (1992) Learning with continuous classes. In: Proceedings of the 5th Australian joint Conference on Artificial Intelligence, Vol. 92, pp. 343–348.
24. Sugase,K., Dyson,H.J. and Wright,P.E. (2007) Mechanism of coupled folding and binding of an intrinsically disordered protein. Nature, 447, 1021–1025.
25. Lee,B. and Richards,F.M. (1971) The interpretation of protein structures: estimation of static accessibility. J. Mol. Biol., 55, 379–400.
26. Smith,R.E., Lovell,S.C., Burke,D.F., Montalvao,R.W. and Blundell,T.L. (2007) Andante: reducing side-chain rotamer search space during comparative modeling using environment-specific substitution probabilities. Bioinformatics, 23, 1099–1105.

Journal

Nucleic Acids Research – Oxford University Press

Published: Jul 1, 2014
