Motivation: Remote homology detection is the problem of detecting homology in cases of low sequence similarity. It is a hard computational problem with no approach that works well in all cases.
Results: We present a method for detecting remote homology that is based on the presence of discrete sequence motifs. The motif content of a pair of sequences is used to define a similarity that is used as a kernel for a Support Vector Machine (SVM) classifier. We test the method on two remote homology detection tasks: prediction of a previously unseen SCOP family and prediction of an enzyme class given other enzymes that have a similar function on other substrates. We find that it performs significantly better than an SVM method that uses BLAST or Smith-Waterman similarity scores as features.
Availability: The software is available from the authors upon request.
Contact: [email protected]
Keywords: remote homology, discrete sequence motifs, sequence similarity, Support Vector Machines, kernel methods
*To whom correspondence should be addressed.
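The abstract states that the motif content of a pair of sequences defines a similarity used as an SVM kernel, but it does not give the formula. The sketch below shows one possible reading, not the authors' implementation: each sequence is mapped to a vector of motif occurrence counts, and the kernel is the dot product of two such vectors. The motif list, the regular-expression encoding of motifs, the counting of overlapping matches, and the names `MOTIFS`, `motif_counts`, `motif_kernel`, and `gram_matrix` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical toy motifs written as regular expressions ("." matches any
# residue). The motif set used in the paper is not given in the abstract.
MOTIFS = [r"C..C", r"G[KR]S", r"H.H"]


def motif_counts(sequence, motifs=MOTIFS):
    """Map a sequence to a sparse vector: motif index -> occurrence count."""
    counts = Counter()
    for index, pattern in enumerate(motifs):
        # Wrapping the pattern in a lookahead counts overlapping occurrences.
        hits = len(re.findall(f"(?=({pattern}))", sequence))
        if hits:
            counts[index] = hits
    return counts


def sparse_dot(u, v):
    """Dot product of two sparse count vectors."""
    return sum(u[i] * v[i] for i in u.keys() & v.keys())


def motif_kernel(seq_a, seq_b, motifs=MOTIFS):
    """Assumed similarity: dot product of the two motif-count vectors."""
    return sparse_dot(motif_counts(seq_a, motifs), motif_counts(seq_b, motifs))


def gram_matrix(sequences, motifs=MOTIFS):
    """Pairwise kernel values for a list of sequences."""
    vectors = [motif_counts(s, motifs) for s in sequences]
    return [[sparse_dot(u, v) for v in vectors] for u in vectors]


if __name__ == "__main__":
    for row in gram_matrix(["ACAACGKS", "CAACAAC", "AAAA"]):
        print(row)
```

Because the kernel is an explicit dot product of feature vectors, the resulting matrix is symmetric and positive semidefinite, so it could be handed to any SVM implementation that accepts a precomputed kernel matrix.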
Bioinformatics – Oxford University Press
Published: Jul 3, 2003