
Protein Secondary Structure Prediction by Merged Hidden Markov Models

Student name: Christian A. Cumbaa (M.Math student in computer science)
University of Waterloo, Waterloo, Ontario, Canada
Advisor: Dr. Forbes J. Burkowski
http://www.math.uwaterloo.ca/~ccumbaa

Project Overview

A protein molecule is a linear chain of amino acid residues, which typically folds into a complex, globular shape in its native solvent environment. The protein folding problem is that of determining the native three-dimensional (tertiary) structure of a protein molecule given only its amino acid sequence and its environment.

The importance of the protein folding problem springs from the huge amount of genetic sequence data currently available and the many ongoing whole-genome sequencing projects. Determining the shape of a protein whose amino acid sequence is encoded in a gene sequence is an intermediate step on the path to understanding the function of an organism. Knowing the structure of a target protein is also crucial to rational drug design.

The protein folding problem is hard. Determining protein structure by experimental observation is an expensive and time-consuming process. Solving the structure by molecular dynamics simulation is not yet computationally feasible. Machine learning methods have therefore been developed to predict tertiary structure, but none are very successful.

A simpler, but related, problem is that of predicting protein secondary structure. Within a protein molecule, segments of amino acid residues align into regular substructures such as α helices, β sheets and coils. Secondary structure prediction is the assignment of α, β, and coil labels to each residue in a molecule. The best machine learning methods achieve a maximum success rate of about 75%, depending on the similarity of the target protein to proteins with known structure. These methods include sequence alignment, statistical methods, neural networks, and hidden Markov models (HMMs).
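One common way to read a success rate like the one quoted above is as per-residue accuracy: the fraction of residues whose predicted label matches the observed label. The sketch below assumes that reading, which the text does not state explicitly. The function name, the single-letter codes (H for α helix, E for β sheet, C for coil), and the toy label strings are illustrative assumptions, not data from this project.

```python
def per_residue_accuracy(predicted: str, observed: str) -> float:
    """Return the fraction of residues whose predicted label equals the observed one.

    Both arguments are strings of per-residue labels of equal, nonzero length.
    The label alphabet (H, E, C) is an assumed convention for this sketch.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("label strings must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Toy example (made-up labels): 12 of the 14 positions agree.
observed = "CHHHHHHCCEEEEC"
predicted = "CHHHHHCCCEEECC"
accuracy = per_residue_accuracy(predicted, observed)
```

Averaging this value over a test set of proteins with known structure would give a single figure comparable to the percentage above, under the stated assumption about how it is measured.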
HMMs are a common tool for biological sequence analysis. An HMM is a probabilistic model that generates sequences by a series of random transitions between internal states and a random emission of sequence units after each transition.

We are developing a pattern-based, statistical approach to protein secondary structure prediction. Our model of protein folding assumes that protein structure is governed by short patterns (motifs) in the amino acid sequence. Each pattern exerts a local influence, represented by probability distributions over the structure space, on the underlying structure of a protein molecule. When two or more patterns occur in overlapping regions of a sequence, their structural influences combine to form a new, unified influence (probability distribution). We use a limited form of HMM to model the structural influences.

Our prediction method has a preprocessing step and a prediction step. The preprocessing step finds the patterns and calculates their structural influences. To find the patterns, a training set containing protein sequences with known secondary structure is searched using a pattern discovery algorithm. Next, for each pattern, a smaller training set is assembled from the underlying structures at each occurrence of the pattern. This smaller set is used to train an HMM representing the structural influence of the pattern on a protein.

The prediction step is applied to a target sequence. The target sequence is searched for occurrences of patterns found in the preprocessing step. The HMMs for each pattern are then combined using a special merging operation developed for this study, to create a single HMM describing the probability distribution over all possible underlying structures for the target sequence.
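The part of the preprocessing step that assembles a smaller training set for each pattern can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes patterns are given as regular expressions over one-letter amino acid codes (the text does not specify the pattern format), and it skips pattern discovery, HMM training, and the merging operation entirely. The function name and the toy sequences are hypothetical.

```python
import re
from collections import defaultdict


def collect_structure_segments(training_set, patterns):
    """Map each pattern to the structure segments underlying its occurrences.

    training_set: iterable of (sequence, structure) string pairs of equal length.
    patterns: iterable of regular-expression strings (an assumed pattern format).
    Patterns with no occurrence are absent from the returned dict.
    """
    training_set = list(training_set)
    segments = defaultdict(list)
    for pattern in patterns:
        # A zero-width lookahead lets finditer report overlapping occurrences.
        finder = re.compile(f"(?=({pattern}))")
        for sequence, structure in training_set:
            if len(sequence) != len(structure):
                raise ValueError("sequence and structure must have equal length")
            for match in finder.finditer(sequence):
                start, end = match.span(1)
                # Keep the known structure labels under this occurrence.
                segments[pattern].append(structure[start:end])
    return dict(segments)


# Toy training data (made up): sequences paired with per-residue structure labels.
training_set = [
    ("MKALAAGLK", "CCHHHHHCC"),
    ("GALAEVK", "CHHHHEE"),
]
segments = collect_structure_segments(training_set, ["A.A", "G.K"])
```

Each list in `segments` plays the role of the smaller training set described above: the collection of observed structures at every occurrence of one pattern, from which a per-pattern model would then be trained.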
Our current research focuses on theoretical justification of the HMM merging step, selection of Bayesian prior probabilities in the HMM learning step, and methods for eliminating bias from sequences in the Protein Data Bank, the source of our training data.

Acknowledgement

This research is supported by a grant from Communications and Information Technology Ontario (CITO).


Protein secondary structure prediction by merged hidden Markov models

Cumbaa, Christian A.
ACM SIGBIO Newsletter, Volume 20 (1)
Association for Computing Machinery, Apr 1, 2000
