BIOINFORMATICS APPLICATIONS NOTE
Vol. 27 no. 10 2011, pages 1438–1439
doi:10.1093/bioinformatics/btr096
Structural bioinformatics
Advance Access publication February 23, 2011

The MEMPACK alpha-helical transmembrane protein structure prediction server

Timothy Nugent*, Sean Ward and David T. Jones*
Bioinformatics Group, Department of Computer Science, University College London, Gower Street, London WC1E 6BT, UK
Associate Editor: Burkhard Rost

ABSTRACT

Motivation: The experimental difficulties of alpha-helical transmembrane protein structure determination make this class of protein an important target for sequence-based structure prediction tools. The MEMPACK prediction server allows users to submit a transmembrane protein sequence and returns transmembrane topology, lipid exposure, residue contacts, helix–helix interactions and helical packing arrangement predictions in both plain text and graphical formats using a number of novel machine learning-based algorithms.

Availability: The server can be accessed as a new component of the PSIPRED portal at http://bioinf.cs.ucl.ac.uk/psipred/.

Contact: d.jones@cs.ucl.ac.uk; t.nugent@cs.ucl.ac.uk

Received on November 25, 2010; revised on January 27, 2011; accepted on February 17, 2011

*To whom correspondence should be addressed.

© The Author 2011. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

1 INTRODUCTION

Given the biological and pharmacological importance of transmembrane (TM) proteins and the difficulties associated with obtaining their crystal structures, the use of bioinformatics approaches to direct experimental work while furthering our understanding of their structure and function is essential. The MEMPACK prediction server applies a selection of machine learning-based tools to predict TM topology (the total number of TM helices, their boundaries and in/out orientation relative to the membrane), together with lipid exposure, residue contacts and helix–helix interactions, culminating in prediction of the optimal helical packing arrangement using a force-directed algorithm. Figure 1 provides an example of some of the server output. The underlying tools have recently been shown to provide significant improvements in prediction accuracy compared with existing methods. It is hoped that this service will be of benefit to the broader scientific community.

2 METHODS

In order to predict TM protein topology, the server employs the MEMSAT3 (Jones, 2007) and MEMSAT-SVM (Nugent and Jones, 2009a) methods, which are based on neural network and SVM classifiers, respectively. Both methods use a dynamic programming algorithm to return a list of the most likely topologies ranked by overall likelihood, and both are capable of predicting the presence of signal peptides. MEMSAT-SVM can additionally predict reentrant helices: membrane-penetrating helices that enter and exit the membrane on the same side, common in many ion channel families. The methods were trained using PSI-BLAST (Altschul et al., 1997) profile data generated from the Möller dataset (Möller et al., 2000), in the case of MEMSAT3, or a crystal structure-based training set, in the case of MEMSAT-SVM, and achieved maximum topology prediction accuracies of 78% (Möller set) and 89% (crystal structure set) when fully cross-validated using a jack knife test. The higher fraction of eukaryotic sequences in the Möller set, compared with the relative bias toward prokaryotic sequences in the crystal structure set, suggests that the strong performance of these two methods makes their combination ideally suited to whole-genome annotation of alpha-helical TM proteins.

3 PREDICTION OF THE OPTIMAL HELICAL PACKING ARRANGEMENT

Despite significant efforts to predict TM protein topology, comparatively little attention has been directed toward developing a method to help users determine possible 3D packing arrangements for helices. Our novel tool MEMPACK (Nugent and Jones, 2009b) uses a range of features to predict residue contacts and helix–helix interactions before using this information to predict the optimal helical packing arrangement. First, an SVM classifier, trained using lipid exposed residue profiles labelled according to molecular dynamics simulation data (Sansom et al., 2008), is used to predict per residue lipid exposure. This information is then combined with PSI-BLAST profile data for each interacting residue and additional sequence-based features as input data for an SVM to predict residue contacts. Combining these results with predicted topology information, helix–helix interactions can then be predicted and used to optimally arrange the helices using a graph-based approach. By employing a force-directed algorithm, the method attempts to minimize edge crossing while maintaining uniform edge length, attributes common in native structures. Finally, a genetic algorithm is used to rotate helices in order to prevent residue contacts occurring across the longitudinal helix axis.

Under stringent cross-validation on a non-redundant test set of 74 protein chains, the method achieved 70% lipid exposure and 67% helix–helix interaction prediction accuracy, both significant improvements over existing methods. In the 23 cases where all helix–helix interactions were successfully predicted, the method produced a helical packing arrangement that closely resembled a 2D slice taken from the crystal structure approximately normal to the likely plane of the lipid bilayer in 14 cases. Of the remaining 51 cases, 34 were partially predicted while 17 had no predicted interactions, highlighting the challenges that remain for helix–helix interaction prediction in TM proteins.

Funding: Part of this work was supported by the BioSapiens project, which is funded by the European Commission within its FP6 Programme, under the thematic area ‘Life sciences, genomics and biotechnology for health’ (contract number LSHG-CT-2003-503265). Funding was also provided by the Biotechnology and Biological Sciences Research Council and the Wellcome Trust (grant number GR066745MA). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the article.

Conflict of Interest: none declared.

REFERENCES

Altschul,S.F. et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.
Jones,D.T. (2007) Improving the accuracy of transmembrane protein topology prediction using evolutionary information. Bioinformatics, 23, 538–544.
Möller,S. et al. (2000) A collection of well characterised integral membrane proteins. Bioinformatics, 16, 1159–1160.
Nugent,T. and Jones,D.T. (2009a) Transmembrane protein topology prediction using support vector machines. BMC Bioinformatics, 10, 159.
Nugent,T. and Jones,D.T. (2009b) Predicting transmembrane helix packing arrangements using residue contacts and a force-directed algorithm. PLoS Comput. Biol., 6, e1000714.
Sansom,M.S. et al. (2008) Coarse-grained simulation: a high-throughput computational approach to membrane proteins. Biochem. Soc. Trans., 36, 27–32.

Fig. 1. Sample output for Archaerhodopsin-1, showing predicted transmembrane regions via MEMSAT and MEMSAT-SVM, the MEMSAT-SVM helix orientation cartoon and the predicted helical packing arrangement from MEMPACK. The plots underneath the schematic topology diagram show the raw scores generated by the SVMs that distinguish between TM helices and loop regions (H/L), inside loops and outside loops (iL/oL), reentrant loops or non-reentrant loops (RE/!RE) and signal peptides or non-signal peptides (SP/!SP). Colors in the MEMPACK cartoon indicate hydrophobic residues (blue), polar residues (red) and charged residues (green for negative, purple for positive). Lines between residues indicate a predicted interaction.
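The force-directed arrangement step described in Section 3 can be illustrated with a minimal sketch. This is not the MEMPACK implementation: it is a generic spring embedder in Python, in which helices are graph nodes and predicted helix–helix interactions are edges. The function names (force_directed_layout, edge_lengths), the parameters (ideal_length, gain, max_step) and the seven-helix interaction list are all hypothetical and chosen only for illustration. The sketch pulls interacting helices toward a uniform edge length and pushes all helices apart; it does not explicitly count or minimize edge crossings, and it omits the genetic algorithm used to rotate helices.

```python
import math
import random


def force_directed_layout(n_helices, interactions, ideal_length=1.0,
                          iterations=500, seed=0):
    """Return 2D coordinates for n_helices nodes.

    Hypothetical helper, not the MEMPACK code. `interactions` is an
    iterable of (i, j) helix index pairs treated as graph edges.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    pos = [[rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
           for _ in range(n_helices)]
    edges = {frozenset(pair) for pair in interactions}
    max_step = 0.1  # cap on how far a helix may move per iteration
    gain = 0.1      # damping applied to the net force (assumed value)

    for _ in range(iterations):
        disp = [[0.0, 0.0] for _ in range(n_helices)]
        for i in range(n_helices):
            for j in range(i + 1, n_helices):
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                # Every pair repels, which keeps helices from overlapping.
                force = ideal_length ** 2 / dist
                # Interacting pairs also attract; the two terms cancel
                # when the pair is exactly ideal_length apart.
                if frozenset((i, j)) in edges:
                    force -= dist ** 2 / ideal_length
                fx = force * dx / dist
                fy = force * dy / dist
                disp[i][0] += fx
                disp[i][1] += fy
                disp[j][0] -= fx
                disp[j][1] -= fy
        for i in range(n_helices):
            mx, my = gain * disp[i][0], gain * disp[i][1]
            length = math.hypot(mx, my)
            if length > max_step:
                mx, my = mx * max_step / length, my * max_step / length
            pos[i][0] += mx
            pos[i][1] += my
        max_step *= 0.99  # cool down so the layout settles
    return [tuple(p) for p in pos]


def edge_lengths(pos, interactions):
    """Distance between the two helices of each interaction."""
    return [math.hypot(pos[i][0] - pos[j][0], pos[i][1] - pos[j][1])
            for i, j in interactions]


if __name__ == "__main__":
    # Invented example: seven helices, each interacting with its
    # sequence neighbours and the last with the first. These are not
    # predictions from the server.
    ring = [(k, (k + 1) % 7) for k in range(7)]
    layout = force_directed_layout(7, ring)
    for helix, (x, y) in enumerate(layout, start=1):
        print(f"helix {helix}: x={x:.2f} y={y:.2f}")
    print("edge lengths:", [round(d, 2) for d in edge_lengths(layout, ring)])
```

In this sketch, an isolated interacting pair settles at ideal_length because the attraction and repulsion terms cancel there. In larger graphs the extra repulsion from non-interacting helices stretches the edges somewhat, so edge lengths are only approximately uniform.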
Bioinformatics – Oxford University Press
Published: Feb 23, 2011