Nucleic Acids Research, 2005, Vol. 33, Web Server issue W289–W294
doi:10.1093/nar/gki390

PRALINE: a multiple sequence alignment toolbox that integrates homology-extended and secondary structure information

V. A. Simossis (1) and J. Heringa (1,2,*)

(1) Bioinformatics Section, Faculty of Sciences and (2) Centre for Integrative Bioinformatics VU (IBIVU), Faculty of Sciences and Faculty of Earth & Life Sciences, Vrije Universiteit, De Boelelaan 1081A, 1081 HV, Amsterdam, The Netherlands

Received February 11, 2005; Revised and Accepted March 10, 2005

*To whom correspondence should be addressed. Tel: +31 20 5987649; Fax: +31 20 5987653; Email: [email protected]

© The Author 2005. Published by Oxford University Press. All rights reserved. The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use, please contact [email protected]

ABSTRACT

PRofile ALIgNEment (PRALINE) is a fully customizable multiple sequence alignment application. In addition to a number of available alignment strategies, PRALINE can integrate information from database homology searches to generate a homology-extended multiple alignment. PRALINE also provides a choice of seven different secondary structure prediction programs that can be used individually or in combination as a consensus for integrating structural information into the alignment process. The program can be used through two separate interfaces: one has been designed to cater to more advanced needs of researchers in the field, and the other for standard construction of high confidence alignments. The web-based output is designed to facilitate the comprehensive visualization of the generated alignments by means of five default colour schemes based on: residue type, position conservation, position reliability, residue hydrophobicity and secondary structure, depending on the options set. A user can also define a custom colour scheme by selecting which colour will represent one or more amino acids in the alignment. All generated alignments are also made available in the PDF format for easy figure generation for publications. The grouping of sequences, on which the alignment is based, can also be visualized as a dendrogram. PRALINE is available at http://ibivu.cs.vu.nl/programs/pralinewww/.

INTRODUCTION

The alignment of two or more sequences has become an essential sequence analysis technique in biological research. State-of-the-art multiple sequence alignment (MSA) methods, such as T-COFFEE (1) and MUSCLE (2), as well as other MSA methods available to date, perform alignments by only using the sequences in the given set. Although they use profile technology to match distant sequence sets, they do not use further homology information for the sequences that are available in current sequence databases. The benefit of using homologous information to align distant sequences has been shown in a number of studies (3), while the use of profiles to represent the additional homologous information has been shown to have many advantages (4,5). For this reason, the PRALINE toolbox (6,7) has been recently re-designed to include homology-extended multiple alignment (8), where as an initial step a profile for each sequence in a given set is built by using PSI-BLAST (9,10) and the progressive alignment then proceeds using the PSI-BLAST profiles instead of the given sequences. This approach has been previously applied with success to local pairwise alignment methods for homology modelling (11–15) and is extended in PRALINE for global MSA. The recently updated MAFFT alignment tool (3,16) also uses homologous sequences to improve the alignment quality of distant sequences. However, in the MAFFT approach, the additional information is not incorporated in profiles for each of the query sequences, but homologous sequences are added to the original set and then aligned together using the various MAFFT alignment strategies. In the end, the homologous sequences are removed, leaving the aligned original sequences to form the final alignment.

In this paper we present the new web server for the PRALINE toolbox (6,7), where we have added two new alignment features: homology-extended multiple alignment (8) and the integration of predicted secondary structure information with iteration capabilities (V. A. Simossis and J. Heringa, submitted for publication). We show results for the cytochrome P450 HOMSTRAD (17) sequence set as an example to demonstrate how the homology-extended strategy and the integration of secondary structure information, in combination with the visualization possibilities of the server output, can lead to meaningful interpretations. Details about PRALINE and its optimizations have been described previously (6–8,18).

HOMOLOGY-EXTENDED MULTIPLE ALIGNMENT

The homology-extended MSA strategy enriches the information for each of the sequences in a given set by collecting putative homologous sequences. Each sequence is submitted as a query to PSI-BLAST over a database of choice [default: non-redundant (NR)]. The resulting PSI-BLAST alignments are then filtered for redundancy (100% sequence identity). In the event that no hits or only redundant hits are detected, the PSI-BLAST E-value threshold is automatically adjusted to a 10-fold less stringent setting (e.g. from 10 × 10^-6 to 10 × 10^-5) and the query is re-submitted. Once all the sequences to be aligned have at least one additional putative homologue, each PSI-BLAST alignment is converted into a profile and progressively aligned. A more detailed account of the PRALINE homology-extended multiple alignment algorithm and its performance is available in Ref. (8).

The advantage of this strategy is that it uses a much larger amount of position-specific information in the homology-extended profiles to score the alignment of two or more positions. As a result, the cases that benefit the most are those that evolution has changed so extensively (<30% identity) that the homology (common ancestry) between them is almost undetectable when compared directly (8).

In Table 1, the performance of the homology-extended alignment strategy on 254 HOMSTRAD (17) multiple alignment cases has been compared with the state-of-the-art methods T-COFFEEv2.03 and MUSCLEv3.51. The results show that for the strictest quality measure, column scoring, the overall improvement of the PRALINE_PSI strategy is >3.5% relative to T-COFFEE and MUSCLE. Moreover, the improvement is >5% for the most distant and difficult test cases with <30% sequence identity. In addition, PRALINE_PSI has also been compared with the PRALINE standard global progressive alignment strategy (PRALINE_BASIC) (6) and with the PRALINE_BASIC and PRALINE_PSI strategies with integrated predicted [PSIPRED (19) and YASPIN (20)] secondary structure information, named PRALINE_BASIC-PSIPRED, PRALINE_BASIC-YASPIN, PRALINE_PSI-PSIPRED and PRALINE_PSI-YASPIN, respectively. These secondary structure-guided alignment strategies of PRALINE are discussed in the next section. As shown in Table 1, the improvement in alignment quality achieved by the homology-extended alignment strategies (PRALINE_PSI) as compared with other methods is significant in the more difficult alignment cases with average sequence identity percentages <60%. As would be expected, in the easier alignment cases that share >60% sequence identity, all the alignments are of comparable high quality.

When used as an option on the server, the homology-extended alignment strategy can further be customized by manually entering the desired iteration count, starting E-value cut-off and database to be searched by PSI-BLAST for the building of the homology-extended profiles (default: 3 iterations, starting with a cut-off of 10 × 10^-6 on the NR database). The default parameters have been optimized by testing different settings on the HOMSTRAD database of structural alignments (8).

Table 1. The quality assessment of 254 HOMSTRAD multiple alignment cases generated by different alignment strategies

Alignment method         Overall (%)   0–30 (%)   30–60 (%)   60–100 (%)   P (0–100)
Column score
PRALINE_BASIC            63.8          38.7       68.5        95.5         –
PRALINE_BASIC-YASPIN     68.0          45.3       72.2        96.3         0.106
PRALINE_BASIC-PSIPRED    67.4          43.5       72.1        95.9         0.337
PRALINE_PSI              70.2          50.2       73.6        96.7         0.025
PRALINE_PSI-YASPIN       70.0          49.7       73.6        96.5         0.042
PRALINE_PSI-PSIPRED      70.1          50.2       73.5        96.7         0.014
TCOFFEEv2.03             67.6          44.0       72.2        95.8         0.237
MUSCLEv3.51              67.5          45.0       71.6        96.3         0.461

The significance of the results (P-value from Kolmogorov–Smirnov test) is calculated with regard to the PRALINE_BASIC method. The column scores are the percentage of correctly aligned columns with regard to the HOMSTRAD structure alignment. The sequence identity ranges (0–30, 30–60 and 60–100) are given in per cent.

INTEGRATION OF SECONDARY STRUCTURE

The rule-of-thumb that structure is more conserved than sequence is a well-documented fact (21–24). As a result, many studies have shown that its use to guide sequence alignment improves alignment quality, especially between distant sequences (6–8,11–15,25). To this end, we have devised a secondary structure scoring scheme for the alignment algorithm that combines exchange weights from four types of matrices: sequence or profile positions that have not been assigned the same secondary structure class are scored using a generic matrix (default: BLOSUM62), otherwise the positions that have matching helix, strand or coil assignments use the Lüthy (26) helix-, strand- and coil-specific matrices, respectively. The use of the secondary structure information significantly improves the PRALINE_BASIC alignment quality and also boosts the PRALINE_PSI alignments in the very difficult alignment cases with <20% sequence identity (V. A. Simossis and J. Heringa, submitted for publication). In Table 1, it is clearly shown that the use of the secondary structure is beneficial for PRALINE_BASIC (>4% improvement in cases with <60% identity), albeit not as significant as the improvements seen with PRALINE_PSI.

The secondary structure integration options of PRALINE involve the use of any one of the seven prediction methods that are listed [PHDpsi (27), PROFsec (B. Rost, unpublished data), SSPRO 2.01 (28), YASPIN (20), PSIPRED (19), JNET (29) and PREDATOR (30,31)] to predict the secondary structure of the input sequences. In addition, the user can optionally select to also search the Protein Data Bank (PDB) to find 3D structure information for the input sequences and use the DSSP-derived secondary structure for the alignment. If both DSSP and a prediction method are selected, the predictions will only be integrated into the alignment for those sequences that do not have a PDB entry. Finally, in the same list as the seven prediction methods, an optimally segmented (24) or majority voting consensus can be alternatively used that currently combines the predictions of PROFsec, YASPIN and PSIPRED.

PROFILE PRE-PROCESSING AND ITERATION

PRALINE provides a number of alignment strategies, such as profile pre-processing and iterative alignment optimization (6,7). The secondary structure-guided strategies using PHD, PROFsec, JNET and SSPRO, and the profile pre-processing strategies can be set to use consistency information to drive subsequent alignment rounds (iterations), each time drawing upon the theoretically higher quality information from the previous cycle. A detailed account of these strategies can also be found in previously published work (6,7,18,25,32,33).

THE NEW PRALINE SERVER

The PRALINE program is designed to use two or more input protein sequences in the FASTA format (34). The proposed maximum number of sequences that should be submitted to the server is set to 500 with length 2000, but this is mainly to limit the server load and is not the limit of the PRALINE program. In addition, owing to the long running time needed for strategies such as PRALINE_PSI, an optional email notification can be requested that is delivered upon completion of the job and contains the link to the results and some statistics on the resulting alignment.

Similar to the previous version of the server (18), the gap opening and gap extension penalties and the amino acid substitution matrix can be manually set if needed [default: 12, 1 with BLOSUM62 (35)] for any of the PRALINE alignment strategies. The results page is automatically displayed once the job is complete and contains various sections depending on the options selected (Figure 1). In order to provide all generated files for the user, there is a link to download a compressed file with all the results in the job directory [Figure 1, (D)] and also individual links that allow the user to download specific files related to each sequence in the set (e.g. a PSI-BLAST profile or a secondary structure file) [Figure 1, (E)].

If the iteration number selected is >0, a subtitle informs the user which iteration cycle results are presented on the page [Figure 1, (A)]. The alignment from each iteration cycle is presented on a different page and is accessible by the corresponding links [Figure 1, (C)]. In addition, the page reports the total time taken for the process to complete, provides some statistics related to the visible alignment [Figure 1, (B)] and states whether the iterations were halted due to alignment convergence or limit cycle convergence and which iteration was the last (not applicable in the Figure 1 example). In the case of iteration-specific output, such as the alignment of the iteration or the secondary structure prediction, additional links are displayed [Figure 1, (F)].

If profile pre-processing is selected the user has the option of viewing the profile pre-processing scores for all pairwise alignments for deriving an optimum cut-off value [Figure 1, (G)]. Finally, depending on the selected parameters of the job, a series of buttons allows switching between the available colour-coded views [Figure 1, (H)] [details about the colour schemes are described in (18)]. At any point, the visible alignment can be converted into a PDF for printing or further manipulation [Figure 1, (I)]. The remainder of the results page consists of a short description of the visible colour scheme with a key to the colours, after which the colour-coded alignment follows (an example of the conservation and the secondary structure colour-coding is shown in Figure 2).

Figure 1. The PRALINE results page headers. A: The subtitle indicating which iteration results are presented on this page (only available if iteration >0 is selected). B: The time taken to run the job and statistics related to the visible alignment. C: The links to all other available iteration cycle results (only available if iteration >0 is selected). D: The link to download all job files as a compressed file. E: Links to tabulated specific file types. F: Links to iteration-specific output files (only available if iteration >0 is selected). G: The button that hides/reveals the profile pre-processing scores of the sequence set (only available if profile pre-processing is selected). H: The buttons that switch between colour schemes. I: The button that generates and opens a PDF version of the alignment in the visible colour scheme.

Figure 2. The PRALINE_PSI P450 alignment using both PROFsec and DSSP secondary structure integration settings. The alignment has been sectioned to focus on the regions containing the conserved motifs of the cytochrome P450 enzymes (signified by the black bars above the rulers). (A) The oxygen-binding motif, (B) the ExxR motif and (C) the haem-binding motif. For each section, the top colour scheme shows conservation levels according to the colour key and the bottom one shows the secondary structure each residue belongs to (red: helix; green: strand; and clear: coil). The ruler on top of each alignment block shows which parts of the alignment are visible.

SAMPLE OUTPUTS

Owing to the large number of possible outputs, we have provided a set of nine representative sample outputs for the P450 alignment on the server, each one representing a different combination of PRALINE strategies and settings. These examples are intended as supplementary material to this article and can be accessed through a dedicated link on the server pages or directly at http://ibivu.cs.vu.nl/programs/pralinewww/example/. They can also be used as an indication of CPU times needed by each of the PRALINE strategies.

In Figure 2, we illustrate sections of the PRALINE_PSI alignment of the 'p450' HOMSTRAD sequence set (21% average sequence identity) using both DSSP (36) and PROFsec secondary structure integration settings. The colour schemes in the figure are for positional conservation and secondary structure. The secondary structure information for each sequence in this alignment has been derived by using DSSP, since all the sequences have a corresponding PDB structure.

The cytochrome P450 enzymes primarily act as oxidases in multi-component electron transport chains to break down naturally occurring toxins and mutagens. The structure is almost triangular, with the C-terminal part being mostly helical, while the N-terminal part is more β-sheet rich. The signature motif of P450 enzymes is the haem-binding site, which is often represented as FxxGxxxCxG (Figure 2C). Other conserved regions include the motif A(A/G)x(E/D)T (Figure 2A), where the threonine (T) residue is part of the oxygen-binding site, and an invariant ExxR sequence (Figure 2B). The ExxR and the C residue at the haem-binding site are the only completely conserved amino acids in P450s. These well-documented details are straightforwardly visualized in the PRALINE output conservation colour scheme, while the secondary structure view allows us to relate them in a structural context. As stated in the literature (37), the oxygen-binding and ExxR motifs are each part of two distinct C-terminal helices, while the haem-binding motif flanks the N-terminal end of the last helix. Owing to space limitations the alignment has been sectioned to concentrate on these regions, but the full alignment can be viewed online in example 9 of the supplementary material.

ACKNOWLEDGEMENTS

The authors would like to thank the Vrije Universiteit Amsterdam for funding this project. Special thanks are also due to Drs Franca Fraternali, Jens Kleinjung and John Romein for help with debugging and server testing. Funding to pay the Open Access publication charges of this article was provided by the Vrije Universiteit Amsterdam.

Conflict of interest statement. None declared.

REFERENCES

1. Notredame,C., Higgins,D.G. and Heringa,J. (2000) T-Coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol., 302, 205–217.
2. Edgar,R.C. (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res., 32, 1792–1797.
3. Katoh,K., Kuma,K., Toh,H. and Miyata,T. (2005) MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res., 33, 511–518.
4. Wang,G. and Dunbrack,R.L.,Jr (2004) Scoring profile-to-profile sequence alignments. Protein Sci., 13, 1612–1626.
5. Edgar,R.C. and Sjolander,K. (2004) A comparison of scoring functions for protein sequence profile alignment. Bioinformatics, 20, 1301–1308.
6. Heringa,J. (1999) Two strategies for sequence comparison: profile-preprocessed and secondary structure-induced multiple alignment. Comput. Chem., 23, 341–364.
7. Heringa,J. (2002) Local weighting schemes for protein multiple sequence alignment. Comput. Chem., 26, 459–477.
8. Simossis,V.A., Kleinjung,J. and Heringa,J. (2005) Homology-extended sequence alignment. Nucleic Acids Res., 33, 816–824.
9. Altschul,S.F. and Koonin,E.V. (1998) Iterated profile searches with PSI-BLAST--a tool for discovery in protein databases. Trends Biochem. Sci., 23, 444–447.
10. Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.
11. Chung,R. and Yona,G. (2004) Protein family comparison using statistical models and predicted structural information. BMC Bioinformatics, 5.
12. Ginalski,K., Pas,J., Wyrwicz,L.S., von Grotthuss,M., Bujnicki,J.M. and Rychlewski,L. (2003) ORFeus: detection of distant homology using sequence profiles and predicted secondary structure. Nucleic Acids Res., 31, 3804–3807.
13. Ginalski,K., von Grotthuss,M., Grishin,N.V. and Rychlewski,L. (2004) Detecting distant homology with Meta-BASIC. Nucleic Acids Res., 32, W576–W581.
14. Soding,J. (2004) Protein homology detection by HMM–HMM comparison. Bioinformatics, 21, 951–960.
15. von Ohsen,N., Sommer,I., Zimmer,R. and Lengauer,T. (2004) Arby: automatic protein structure prediction using profile–profile alignment and confidence measures. Bioinformatics, 20, 2228–2235.
16. Katoh,K., Misawa,K., Kuma,K. and Miyata,T. (2002) MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res., 30, 3059–3066.
17. Mizuguchi,K., Deane,C.M., Blundell,T.L. and Overington,J.P. (1998) HOMSTRAD: a database of protein structure alignments for homologous families. Protein Sci., 7, 2469–2471.
18. Simossis,V.A. and Heringa,J. (2003) The PRALINE online server: optimising progressive multiple alignment on the web. Comput. Biol. Chem., 27, 511–519.
19. Jones,D.T. (1999) Protein secondary structure prediction based on position-specific scoring matrices. J. Mol. Biol., 292, 195–202.
20. Lin,K., Simossis,V.A., Taylor,W.R. and Heringa,J. (2005) A simple and fast secondary structure prediction method using hidden neural networks. Bioinformatics, 21, 152–159.
21. Chothia,C. and Lesk,A.M. (1986) The relation between the divergence of sequence and structure in proteins. EMBO J., 5, 823–826.
22. Rost,B. (1999) Twilight zone of protein sequence alignments. Protein Eng., 12, 85–94.
23. Sander,C. and Schneider,R. (1991) Database of homology-derived protein structures and the structural meaning of sequence alignment. Proteins, 9, 56–68.
24. Simossis,V.A. and Heringa,J. (2004) The influence of gapped positions in multiple sequence alignments on secondary structure prediction methods. Comput. Biol. Chem., 28, 351–366.
25. Heringa,J. (2000) Computational methods for protein secondary structure prediction using multiple sequence alignments. Curr. Protein Pept. Sci., 1, 273–301.
26. Lüthy,R., McLachlan,A.D. and Eisenberg,D. (1991) Secondary structure-based profiles: use of structure-conserving scoring tables in searching protein sequence databases for structural similarities. Proteins, 10, 229–239.
27. Przybylski,D. and Rost,B. (2002) Alignments grow, secondary structure prediction improves. Proteins, 46, 197–205.
28. Pollastri,G., Przybylski,D., Rost,B. and Baldi,P. (2002) Improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and profiles. Proteins, 47, 228–235.
29. Cuff,J.A. and Barton,G.J. (2000) Application of multiple sequence alignment profiles to improve protein secondary structure prediction. Proteins, 40, 502–511.
30. Frishman,D. and Argos,P. (1996) Incorporation of non-local interactions in protein secondary structure prediction from the amino acid sequence. Protein Eng., 9, 133–142.
31. Frishman,D. and Argos,P. (1997) Seventy-five percent accuracy in protein secondary structure prediction. Proteins, 27, 329–335.
32. Simossis,V.A. and Heringa,J. (2004) Integrating protein secondary structure prediction and multiple sequence alignment. Curr. Protein Pept. Sci., 5, 249–266.
33. Simossis,V.A., Kleinjung,J. and Heringa,J. (2003) An overview of multiple sequence alignment. In Baxevanis,A.D. (ed.), Current Protocols in Bioinformatics. John Wiley, NY, pp. 3.7.1–3.7.25.
34. Pearson,W.R. (2000) Flexible sequence similarity searching with the FASTA3 program package. Methods Mol. Biol., 132, 185–219.
35. Dayhoff,M.O., Barker,W.C. and Hunt,L.T. (1983) Establishing homologies in protein sequences. Methods Enzymol., 91, 524–545.
36. Kabsch,W. and Sander,C. (1983) Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers, 22, 2577–2637.
37. In Ortiz de Montellano,P.R. (ed.), Cytochrome P450: Structure, Mechanism, and Biochemistry, 2nd edn. Plenum Press, NY.
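
SUPPLEMENTARY SKETCH: E-VALUE RELAXATION

The homology search step described under HOMOLOGY-EXTENDED MULTIPLE ALIGNMENT (query each sequence, discard redundant hits, and retry with a 10-fold less stringent E-value when nothing new is found) can be illustrated with the minimal Python sketch below. It is not the PRALINE implementation. The function name collect_homologues, the search callable that stands in for a PSI-BLAST run, the cap max_relaxations and the use of exact string equality as a proxy for 100% sequence identity are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for a PSI-BLAST run: takes a query sequence and an
# E-value threshold and returns the hit sequences found at that threshold.
SearchFn = Callable[[str, float], List[str]]


def collect_homologues(
    query: str,
    search: SearchFn,
    start_evalue: float = 10e-6,  # assumed starting cut-off (10 x 10^-6)
    max_relaxations: int = 5,     # assumed cap; the text does not state one
) -> Tuple[List[str], float, int]:
    """Return (non-redundant hits, E-value used, number of relaxations)."""
    evalue = start_evalue
    for relaxations in range(max_relaxations + 1):
        # Each retry is 10-fold less stringent than the previous one.
        evalue = start_evalue * 10 ** relaxations
        hits = search(query, evalue)

        # Redundancy filter: drop hits identical to the query or to an
        # earlier hit (string equality approximates 100% identity here).
        seen = {query}
        unique: List[str] = []
        for hit in hits:
            if hit not in seen:
                seen.add(hit)
                unique.append(hit)

        if unique:
            return unique, evalue, relaxations
    # No additional homologue found within the allowed relaxations.
    return [], evalue, max_relaxations
```

In a real pipeline the returned hits would then be converted into a profile for progressive alignment; that step is outside the scope of this sketch.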
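
SUPPLEMENTARY SKETCH: SECONDARY STRUCTURE-AWARE SCORING

The scoring scheme described under INTEGRATION OF SECONDARY STRUCTURE selects between a generic exchange matrix and helix-, strand- or coil-specific matrices depending on whether two positions share a secondary structure class. The Python sketch below shows that selection for a single pair of residues, as a simplification of the profile case. The function names, the H/E/C class labels and all matrix values are toy assumptions for illustration; they are not BLOSUM62 or Lüthy values, and this is not the PRALINE code.

```python
from typing import Dict, Tuple

Matrix = Dict[Tuple[str, str], int]


def lookup(matrix: Matrix, a: str, b: str) -> int:
    """Symmetric lookup: try (a, b) first, then (b, a)."""
    if (a, b) in matrix:
        return matrix[(a, b)]
    return matrix[(b, a)]


def ss_aware_score(
    res_a: str, ss_a: str,
    res_b: str, ss_b: str,
    generic: Matrix,
    by_class: Dict[str, Matrix],
) -> int:
    """Score one residue pair, choosing the matrix by secondary structure.

    Matching classes (both helix, both strand or both coil) use the
    class-specific matrix; any mismatch falls back to the generic matrix.
    """
    if ss_a == ss_b and ss_a in by_class:
        return lookup(by_class[ss_a], res_a, res_b)
    return lookup(generic, res_a, res_b)


# Toy matrices over a two-letter alphabet (illustrative values only).
GENERIC: Matrix = {("A", "A"): 4, ("A", "L"): -1, ("L", "L"): 4}
BY_CLASS: Dict[str, Matrix] = {
    "H": {("A", "A"): 5, ("A", "L"): 1, ("L", "L"): 5},   # helix
    "E": {("A", "A"): 3, ("A", "L"): -2, ("L", "L"): 6},  # strand
    "C": {("A", "A"): 4, ("A", "L"): 0, ("L", "L"): 3},   # coil
}
```

With these toy values, an A/L pair scores differently depending on whether both residues are in a helix, both in a strand, or in different classes, which is the behaviour the scheme is designed to produce.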
Nucleic Acids Research – Oxford University Press
Published: Jul 1, 2005