TY - JOUR
AU - Mushegian, Arcady R.
AB - Motivation: Hybridization of oligonucleotides with longer nucleotide sequences is an essential step in nucleic acid biosynthesis in vitro and in vivo, in oligonucleotide-based diagnostics, and in therapeutic applications of oligonucleotides. A major factor determining sensitivity and selectivity of hybridization is the number of base pair mismatches that occur in an ungapped alignment of the oligonucleotide (probe) and a longer sequence (target). Results: The k-distance match count between the probe and the target is defined as the number of ungapped alignments between the two sequences that have exactly k mismatches, and the k-neighbor match count is defined as the sum of the j-distance match counts for j between 0 and k. We derive a novel formula for the probability of a k-distance match. This formula is based on the assumption that the target is strand-symmetric Bernoulli text (i.e. nucleotides are independently, identically distributed in the target and satisfy Chargaff's second parity rule). Our model predicts that the GC-content in both the probe and the target significantly affects the match count expectation. The ratio of k-neighbor match counts in two distinct genomes for a given probe is a measure of its specificity. We calculated such ratios for pairs of bacterial genomes with different combinations of length, GC-content and phylogenetic distance. Examination of the extreme values of these ratios indicates that probes with a high discriminative power exist for each tested pair. Supplementary information: Stowers Institute Technical Report No. 0002, C++ source code, Mathematica notebooks and other information are available at http://www.stowers-institute.org/labs/bioinformatics/omm/index.htm
TI - Distribution of words with a predefined range of mismatches to a DNA probe in bacterial genomes
JF - Bioinformatics
DO - 10.1093/bioinformatics/btg374
DA - 2004-01-01
UR - https://www.deepdyve.com/lp/oxford-university-press/distribution-of-words-with-a-predefined-range-of-mismatches-to-a-dna-1AI8PSHKJH
SP - 67
EP - 74
VL - 20
IS - 1
DP - DeepDyve
ER -