Fixed Character States and the Optimization of Molecular Sequence Data

Cladistics, Volume 15 (4) – Dec 1, 1999

Publisher: Wiley
Copyright: © 1999 Wiley Subscription Services, Inc., A Wiley Company
ISSN: 0748-3007
eISSN: 1096-0031
DOI: 10.1111/j.1096-0031.1999.tb00274.x

Abstract

A method is proposed to optimize molecular sequence data that does not employ multiple sequence alignment. This method treats entire homologous contiguous stretches of sequence data as individual characters. This sequence is treated as the homologous unit employed in phylogeny reconstruction. The sets of specific sequences exhibited by the terminal taxa constitute the character states. The number of states is then less than or equal to the number of unique sequences (or homologous fragments) exhibited by the data. A matrix of transformation costs is created to relate the states to one another. The cells of this matrix are defined as the minimum transformation cost between each pair of states based on insertion–deletion and base substitution costs. The diagnosis of a topology then follows existing dynamic programming techniques, with the number of states greatly expanded. Since the possible sequences reconstructed at nodes are limited to those exhibited by the terminals, cladograms constructed in this way may be longer than those of other methods in that they require a greater number of weighted evolutionary events. Example data, the effects of missing data, restricted ancestors, and putative long‐branch attraction are discussed.
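The procedure the abstract outlines can be illustrated with a small sketch. This is not the author's implementation; it is a minimal reading of the abstract, assuming unit substitution and insertion-deletion costs by default, a plain edit-distance recurrence for the pairwise transformation cost, and a Sankoff-style down-pass for diagnosing a topology. The names `edit_cost`, `cost_matrix`, and `tree_length`, the nested-tuple tree encoding, and the toy sequences are all hypothetical. Missing data and other cases discussed in the paper are not handled.

```python
def edit_cost(a, b, sub=1, indel=1):
    """Minimum cost to transform sequence a into b, given a substitution
    cost and an insertion-deletion cost (standard edit-distance DP)."""
    prev = [j * indel for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * indel]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (0 if a[i - 1] == b[j - 1] else sub)
            cur.append(min(diag, prev[j] + indel, cur[j - 1] + indel))
        prev = cur
    return prev[-1]


def cost_matrix(states, sub=1, indel=1):
    """Pairwise transformation costs between all character states."""
    return [[edit_cost(s, t, sub, indel) for t in states] for s in states]


def tree_length(tree, seqs, sub=1, indel=1):
    """Length of a topology when node states are restricted to the
    sequences observed in the terminals.

    tree: nested tuples of taxon names, e.g. (("A", "B"), ("C", "D"))
    seqs: dict mapping taxon name -> sequence string
    """
    # The character states are the unique terminal sequences.
    states = sorted(set(seqs.values()))
    index = {s: k for k, s in enumerate(states)}
    matrix = cost_matrix(states, sub, indel)
    n = len(states)
    inf = float("inf")

    def down_pass(node):
        if isinstance(node, str):
            # A terminal can only hold its own observed sequence.
            vec = [inf] * n
            vec[index[seqs[node]]] = 0
            return vec
        child_vecs = [down_pass(child) for child in node]
        # Cost of assigning state s here: for each child, the cheapest
        # combination of a transformation s -> t and the child's cost for t.
        return [
            sum(min(matrix[s][t] + cv[t] for t in range(n)) for cv in child_vecs)
            for s in range(n)
        ]

    return min(down_pass(tree))


# Toy data (invented for illustration only).
seqs = {"A": "ACGT", "B": "ACGT", "C": "AGT", "D": "AGTT"}
print(tree_length((("A", "B"), ("C", "D")), seqs))  # prints 2
```

Because the down-pass considers only the observed sequences as candidate node states, the length it returns is an upper bound on what a method free to reconstruct arbitrary ancestral sequences could find, which is the trade-off the abstract notes.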

Journal: Cladistics (Wiley)

Published: Dec 1, 1999
