TY - JOUR AU - Frasch, Wayne, D. AB - Abstract Implementation of DNA computers has lagged behind the theoretical advances due to several technical limitations. These limitations include the amount of DNA required, the efficiency and accuracy of methods to generate and purify answers, and the lack of a reliable method to read the answer. Here we show how to perform calculations using a reasonable amount of DNA with greater efficiency and accuracy and a new readout method that was used to successfully solve a problem with 15 vertices and 210 edges, the largest problem ever solved with DNA. These advances will provide new opportunities for DNA computing to perform practical computations that utilize the massively parallel nature of DNA hybridization. Insight, innovation, integration Biological computing integrates modern molecular biology techniques and computer science to build organic constructs capable of performing computations. Small synthetic pieces of DNA interact in specific ways, such that competition between strands drives the population of hybridized DNA to form a construct that represents the solution to a combinatorial optimization problem. Each strand formed can be a different answer in multiple copies proportional to the optimality of the answer. The spontaneous hybridization of complementary DNA strands performs the calculation, while subsequent ligation, PCR, magnetic affinity purification, and LCR stabilize, isolate, and read the solutions to the problems. Biological computing has the ability to perform massively parallel computations that are prohibitively complex for in silico computers to perform. Introduction Despite exponential growth of processing power, in silico computers can still only find approximate solutions to large nondeterministic polynomial (NP) problems. A linear increase in the number of variables results in a factorial increase in the number of potential solutions, and therefore computing time. 
Although advanced heuristic methods have increased the performance of in silico computers by providing approximate answers,1 they lack the massive parallelism and data storage required to solve such computations quickly. Molecular computing approaches provide a possible avenue to solve these complex problems.2–14 Among these approaches, the size and abundance of DNA molecules as well as their ability to hybridize simultaneously in solution make them an appealing material for use in the construction of biomolecular computers. A Hamiltonian Path Problem (HPP) with 7 vertices and 13 edges was the first problem solved using DNA to make computations.5 The problem was to determine if a path exists through the graph starting and ending at the same vertex that visits each vertex once and only once. To solve this problem, DNA sequences representing all edges and vertices were encoded in single-stranded oligonucleotides. Hybridization allowed the generation of all potential paths encoded in double stranded DNA, and the large number of random paths was screened by the combination of polymerase chain reaction (PCR), gel purification, and magnetic bead separation to remove non-Hamiltonian paths. The presence or absence of remaining DNA strands then provided the “Yes” or “No” answer to the HPP solved. Other strategies to advance DNA computing have emerged, including a way to solve a maximal clique problem on a graph with 6 vertices and 11 edges,8 and an approach using both DNA and RNA to solve a chess problem.9 Braich et al.10 also used DNA computing to solve a 20-variable instance of a 3-SAT problem. It is now apparent that the approach used by Adleman5 has several limitations, the most serious of which is the impossibly large number of molecules required to perform more complex computations. 
To solve an HPP with 200 vertices using this approach, the amount of DNA required would exceed the weight of the Earth.15 Furthermore, although the computation process occurs rapidly, the extraction steps are labor intensive and time consuming. To reduce the number of steps required to perform a computation DNA hairpin structures were used to solve a SAT problem in an autonomous manner.16 To improve scalability, computations were performed through DNA mediated surface computing.17 Problems with the graduated PCR-based answer readout were improved by using a structure-specific cleavage reaction,18 as well as real time PCR, which increased quantitation and precision in answer detection.19 However, these techniques still have not allowed DNA computing to be scaled to solve larger problems. The traveling salesman problem (TSP) is a variant of the HPP. Given a directed weighted graph in which the vertices represent the cities, the edges represent the roads, and the weights are the cost or distance of that road, the TSP asks for the most efficient route that visits each city exactly once and then returns to the starting city. The TSP is very important to companies like Federal Express, UPS and DHL who face this problem daily, in addition to any company that has a retail distribution network. The TSP has been attempted both experimentally20–22 and by the development of a theoretical model23 that is more efficient than what was solved by Adleman.5 However, these methods fall short of practical DNA computing. Methods to solve the TSP have been proposed where the weights of the edges are represented by sequence lengths,21 concentration,22 or melting temperature.24 However, sequence length is not appropriate for representing real values and becomes impractical as the number of different weights increases. 
When edge weight was encoded by DNA concentration, technical difficulties of extracting a single optimal solution from the DNA band prevented the most efficient path from being found.22 The use of melting temperature is limited by the precision of available biochemical techniques. We now report a new approach that uses manageable amounts of DNA to solve an asymmetric TSP with 15 vertices, 210 edges, and 1.3 × 10¹² possible solutions. This approach includes technical innovations that overcome problems that previously limited the development of DNA computers. This is the largest problem solved by molecular computing to date. The method we developed exploits the truly random process inherent in molecular interactions to generate an optimal subset of answers, and demonstrates that interactions of DNA molecules can be used to generate a random sample of a population. A similar process is not possible using in silico computers due to the inability of deterministic circuits to generate truly random samples.25 Methods Oligonucleotide design and construction Except for the starting city, Astart, and ending city, Aend, each city was represented by a synthetic 20-mer sequence of DNA that was chosen to minimize cross hybridization (Table 1). Each 20-mer was composed of two unique 10-mers and had a GC content between 30% and 35%, such that melting points varied from 60.6 to 62.0 °C. Any two city sequences could be linked together by a 20-mer pathway sequence composed of two 10-mers that were complementary to the last and first 10 nucleotides in the sequences of the former and latter cities, respectively (Fig. 1). Pathways were made to complement every combination of cities with the exception of the 5′ and 3′ ends of Astart and Aend, respectively. The Astart and Aend sequences were 30-mers with a higher GC content (66% and 72%) that contained 20-mers at the 5′ and 3′ ends, respectively, for which no complementary sequences were present in the pathway sequences.
Thus, incorporation of Astart or Aend into an answer sequence prevented further growth on that end of the answer sequence. The start and end sequences also served as primer sequences for downstream amplification by PCR. The higher GC content of Astart and Aend was important for answer purification by increasing the efficiency and fidelity of PCR amplification.

Fig. 1 Computation based upon the hybridization and ligation of DNA. Ligation results in the formation of the ordered city pair (B→C). DNA models and sequences are of cities B, C from Table 1, and the pathway BC. Cities are written 5′ to 3′, and the pathway is written 3′ to 5′. The red line represents the phosphodiester bond formed during the ligation.

Table 1 City sequences used in the calculation

All oligonucleotides were purified by reverse phase high-performance liquid chromatography (HPLC) using a Wave System (Transgenomics) with a DNA SEP-HT cartridge. The injection buffer was 75/25% buffer A/B (Transgenomics), and the DNA was eluted from the column using a method that went to a final concentration of 100% buffer B in 8 min. Oligonucleotides (98.2% purity) were concentrated to 200 μM by a DNA SpeedVac (Thermo). This method also allowed us to separate phosphorylated from unphosphorylated DNA. Hybridization and ligation conditions Hybridization was performed by heating the answer formation reaction medium (AFRM) to 92 °C for 4 min, followed by cooling at 1 °C min⁻¹ to 8 °C.
The AFRM was composed of all relevant oligonucleotides in T4 Ligase Buffer (Fermentas). Ligation was performed by incubating at 8 °C for ∼16 h after addition of T4 ligase (5 Weiss Units), 20 mM DTT, and 10 mM ATP to the AFRM. Purification by magnetic beads and PAGE Magnetic affinity purification was carried out by incubating 0.75 μl of biotinylated probe sequence (400 μM) with 150 μl of M-280 streptavidin-coated magnetic beads (∼1.5 mg, Dynal Biotech) in 149 μl of 5× Binding Wash buffer (BW) for 45 min at 25 °C. The beads were washed three times in 500 μl of BW buffer both before and after incubation with the biotinylated probe. The beads were separated from excess probe using a magnetic separator after which 90 μl of ssDNA answer solution were added along with 30 μl of 20× SSC buffer. Answer sequences were hybridized to the bead-bound probes by gentle vortexing and incubation at 25 °C for 1 h. Strands missing single or multiple city sequences were removed by washing twice with 500 μl of 2× SSC buffer, then again with 500 μl of 0.5× SSC. Captured answer sequences were dissociated from the bead-bound probe in 80 μl of 0.1 M NaOH for 10 min after separation using a magnetic separator. The supernatant containing the purified ssDNA answers was neutralized by adding 8.2 μl of 1 M HCl, 10 μl of H2O, and 2 μl of 0.25 M Tris-Cl, pH 7.5. Ligation products were separated by a 6% super denaturing PAGE gel (acrylamide : bisacrylamide, 29 : 1) with 8.5 M urea and 30% formamide in 100 mM Tris-Cl, pH 8.3, 83 mM boric acid, and 2 mM EDTA at 65 °C under 8 V cm⁻¹. The polymerization rate of the gel was controlled using chemical and photochemical catalysis to increase resolution. The 340-mer band was collected from the gel using a Qiagen gel extraction kit. LCR products were profiled on a 10% PAGE gel (acrylamide : bisacrylamide, 29 : 1) in 100 mM Tris-Cl, pH 8.3, 83 mM boric acid, and 2 mM EDTA at 55 °C under 8 V cm⁻¹.
All PAGE gels were stained with ethidium bromide (1 mg ml⁻¹) for 10 min, then visualized and photographed with a UVP BioDoc-It™ UV transilluminator. Amplification by PCR and LCR The purified sequences were PCR amplified in a 50 μl reaction mixture containing 10 mM Tris-HCl, pH 8.3, 50 mM KCl, 1.5 mM MgCl2, 0.001% gelatin, 200 μM of each dNTP, 0.4 μM of each of the Astart and Aend primers, a 1 : 100 ratio of Bst and Taq DNA polymerase, and 0.1–0.2 μg of DNA template. PCR was performed in a PJ2000 DNA thermal cycler programmed for a hot start at 94 °C for 2.5 min followed by 35 cycles of 94 °C for 0.5 min, 70 °C for 35 s, and 72 °C for 30 s, concluded by 3 min at 72 °C. The LCR reaction mixture contained 5 μl target DNA (∼10 ng), 40 units Taq ligase (NEB), 10× Taq ligase buffer, and 400 nM of each of four probes in a 50 μl final volume. The two pairs of probes included two city sequences and their two complements. The 5′ end at the ligation site was phosphorylated for each pair of probes. The LCR was initiated by heating the solution to 94 °C for 2.5 min, followed by 25 cycles with each cycle consisting of 94 °C for 25 s, 41 °C for 35 s, and 45 °C for 150 s. Results Table 2 shows the Markov efficiency matrix that defines the pathways in the fully connected, asymmetric 15-city TSP computed using DNA where each letter represents a different city. The efficiency values in Table 2 were translated into concentrations (1 : 1 pmol) of the pathway sequences used in the computation. The problem was designed to provide an optimal answer with the cities appearing in alphabetical order by adding 100-fold more pathway sequences along the upper diagonal of the distance matrix. A problem with a known answer was solved to test if the computation using DNA would return the optimal answer.
Table 2 Markov efficiency matrix for the 15-city TSP edges (rows: preceding city; columns: following city)

      As    B    C    D    E    F    G    H    I    J    K    L    M    N    O   Ae
As     * 100a    1    1    1    1    1    1    1    1    1    1    1    1    1    1
B      0    *  100    1    1    1    1    1    1    1    1    1    1    1    1    1
C      0    1    *  100    1    1    1    1    1    1    1    1    1    1    1    1
D      0    1    1    *  100    1    1    1    1    1    1    1    1    1    1    1
E      0    1    1    1    *  100    1    1    1    1    1    1    1    1    1    1
F      0    1    1    1    1    *  100    1    1    1    1    1    1    1    1    1
G      0    1    1    1    1    1    *  100    1    1    1    1    1    1    1    1
H      0    1    1    1    1    1    1    *  100    1    1    1    1    1    1    1
I      0    1    1    1    1    1    1    1    *  100    1    1    1    1    1    1
J      0    1    1    1    1    1    1    1    1    *  100    1    1    1    1    1
K      0    1    1    1    1    1    1    1    1    1    *  100    1    1    1    1
L      0    1    1    1    1    1    1    1    1    1    1    *  100    1    1    1
M      0    1    1    1    1    1    1    1    1    1    1    1    *  100    1    1
N      0    1    1    1    1    1    1    1    1    1    1    1    1    *  100    1
O      0    1    1    1    1    1    1    1    1    1    1    1    1    1    *  100
Ae     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    *

a Indicates the distance of the path As→B.
Computations were made by hybridization of the city sequences (Table 1) with pathway sequences in such a way that two city sequences become ligated in an order-specific manner. An example computation is shown in Fig. 1 where Cities B and C become ligated in the order B→C. To make this computation the first (3′) and last (5′) halves of the BC pathway oligonucleotide hybridized to the last (3′) and first (5′) halves of the City B and C sequences, respectively.
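The splint design described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the two 20-mers below are hypothetical placeholders (the Table 1 sequences are not reproduced in this record), and the function names are ours. The pathway is built as the base-by-base complement of the last 10 nucleotides of the former city followed by the complement of the first 10 nucleotides of the latter city, written 3′ to 5′ so that it aligns under the cities written 5′ to 3′, as in Fig. 1.

```python
# Hypothetical placeholder 20-mers, NOT the city sequences of Table 1.
CITY_B = "ATGCATTACGTTAACGATAT"
CITY_C = "TTAGCATAGCTAATCGTTAA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Return the base-by-base complement (no reversal)."""
    return seq.translate(_COMPLEMENT)


def pathway_3_to_5(city_from: str, city_to: str) -> str:
    """Pathway 20-mer written 3'->5', aligned under the cities written 5'->3'.

    The first 10-mer pairs with the last 10 nt of the former city,
    the second 10-mer pairs with the first 10 nt of the latter city.
    """
    if len(city_from) != 20 or len(city_to) != 20:
        raise ValueError("city sequences are expected to be 20-mers")
    return complement(city_from[-10:]) + complement(city_to[:10])


def pathway_5_to_3(city_from: str, city_to: str) -> str:
    """The same oligonucleotide in the conventional 5'->3' orientation."""
    return pathway_3_to_5(city_from, city_to)[::-1]


def ligated_pair(city_from: str, city_to: str) -> str:
    """Ordered 40-mer formed once the splinted cities are ligated."""
    return city_from + city_to


if __name__ == "__main__":
    print("pathway BC (3'->5'):", pathway_3_to_5(CITY_B, CITY_C))
    print("ligated B->C       :", ligated_pair(CITY_B, CITY_C))
```

Because the splint only bridges the 3′ half of one city and the 5′ half of the next, swapping the arguments gives a different pathway, which is what makes the ligation order-specific.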
Answer formation involved a three stage hybridization/ligation procedure beginning with the addition of all city and pathway sequences except Aend and any pathway traveling to Aend. This minimized premature truncation of the answer sequences. The second stage included another round of hybridization/ligation to promote elongation of the answer sequences. In the final stage, Aend and those pathways that travel to Aend were added along with a final round of hybridization/ligation that facilitated the truncation of the answer sequences with Aend. Fig. 2 shows a gel of the 340-mer answer sequences. The correct answer sequences were isolated from incorrect sequences by purifying 340-mers from the hybridization/ligation product using a super denaturing PAGE gel. Sequences containing both Astart and Aend were selected using magnetic affinity purification, and then amplified by PCR using Astart and Aend primers. The PCR product was then subjected to successive magnetic bead purifications in which the beads were coated with sequences complementary to each of the cities B to O. The successive magnetic purification steps ensured that all remaining sequences contained one and only one copy of each city sequence, and thus represented Hamiltonian paths through the city network. It is noteworthy that the band in Fig. 2C was composed of a multitude of Hamiltonian paths (Fig. 3). If the computation were driven by the probability of hybridization, then we anticipate that the optimal answer should be present in the greatest abundance.

Fig. 2 Profiles of DNA computing products by super denaturing PAGE. Lane A: product of hybridization/ligation from the answer formation process. The arrow indicates the 340-mer band that was excised for subsequent purification. Lane B: PCR product of DNA excised from A after magnetic affinity purification for Astart and Aend. Lane C: PCR product of DNA excised from B after magnetic affinity purification for cities B–O.
Lanes M: the molecular marker where the brightest band is 100 bp.

Fig. 3 Sample population of answer sequences that may be found in the single 340-mer band Fig. 2C. The computation was designed to return an optimal answer of AsBCDEFGHIJKLMNOAe. If the computer worked, then the optimal answer should occur in the largest abundance (N copies), while k other correct but suboptimal Hamiltonian paths may potentially exist at lower abundances (n_1, n_2, …, n_k copies respectively). The population should have a distribution with N ≫ n_i, where i = 1, 2, 3, …, k, and k ≤ 15!, though the actual number of suboptimal paths formed is likely to be much smaller than 15! given the amount of DNA used to generate the answers.

To identify the optimal solution the answers in the purified 340 bp band (Fig. 2) were analyzed by determining the relative abundance of all ordered pairs of cities in the answer sequences. This was accomplished by performing n² − n = 210 different LCRs on the answer sequences in parallel using two pairs of probes, one pair for each city in the ordered pair tested. The pairs of probes can only become ligated when the cities are adjacent to one another in a specific order. The use of saturating amounts of probes ensured that the yield of LCR product for a given ordered pair of cities was dependent upon the abundance of the ordered pair in the answer sequences. Fig. 4 shows the PAGE profiles of the LCR products of all 210 ordered pairs of cities generated from the purified answer sequences. The results are organized so that each gel has the same first city of each ordered pair, where each lane is the second city of each ordered pair. The lower bands in each lane correspond to 20-mer DNA city probes that were not ligated during LCR. The upper bands correspond to the 40-mer LCR product composed of the ligated probes and indicate the abundance of that ordered city pair in the answer pool.

Fig. 4 PAGE profile of 40-mer LCR products for every possible city pairing. Each lane tests the relative abundance of the specific ordered city pairing in the answer sequences. Every lane was loaded with 10 μl of LCR product and contained an excess amount of probe DNA that appears as a 20-mer band in every lane. The abundance of the ordered city pair in the sample population is proportional to the intensity of a 40-mer band that may be formed by LCR from the two pairs of 20-mer probes. The letter to the left of each gel indicates the preceding city and the letter above each lane defines the latter city of the ordered city pair. Arrows indicate the most abundant 40-mer LCR product band on each gel.

In each gel the abundance of LCR product was significantly higher for one ordered city pair than the others as indicated by the red arrows. For each gel, the lane for the alphabetical ordered pair (e.g. Astart→B, B→C, C→D, etc.) contained the most LCR product. These dominant bands corresponded to the upper diagonal in the efficiency matrix which had the largest value (Table 2). The relative abundance of ligation products in each lane can be determined by densitometry of each lane in the gel and normalizing to the density of the DNA ladder used as a control. However, the answer was clear enough that it could be determined by inspection of the PAGE gels directly without the need to quantify the bands further. It is noteworthy that LCR products were also formed in minor amounts for some ordered city pairs that represented suboptimal answers. This indicates that the DNA computer was indeed sampling the entire answer space during the computation. The results from all the gels show that the path which exists in the greatest abundance was AstartBCDEFGHIJKLMNOAend. Since this was the optimal answer, these results demonstrate that the DNA computer successfully computed the solution to the problem.
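The readout logic described above, in which the strongest 40-mer band on each gel names the successor of that gel's preceding city, can be sketched in Python as follows. This is a minimal sketch and not the analysis actually performed (the gels were read by inspection): the intensity values are synthetic numbers invented for the example, and the function names are ours.

```python
CITIES = ["As"] + list("BCDEFGHIJKLMNO") + ["Ae"]


def synthetic_intensities(strong=1.0, weak=0.05):
    """Made-up band intensities keyed by (preceding city, latter city).

    The alphabetical successor gets a strong band, every other ordered
    pair a weak one. These values are illustrative only, not measured data.
    """
    table = {}
    for i, first in enumerate(CITIES[:-1]):               # Ae never precedes a city
        for j, second in enumerate(CITIES[1:], start=1):  # As never follows a city
            if first != second:
                table[(first, second)] = strong if j == i + 1 else weak
    return table


def dominant_successors(intensity):
    """For each preceding city (one gel), pick the lane with the strongest band."""
    best = {}
    for (first, second), value in intensity.items():
        if first not in best or value > intensity[(first, best[first])]:
            best[first] = second
    return best


def read_path(intensity, start="As", end="Ae"):
    """Chain the dominant ordered pairs from the start city to the end city."""
    best = dominant_successors(intensity)
    path = [start]
    while path[-1] != end:
        nxt = best.get(path[-1])
        if nxt is None or nxt in path:
            raise ValueError("dominant pairs do not form a single path")
        path.append(nxt)
    return path


if __name__ == "__main__":
    print("".join(read_path(synthetic_intensities())))
```

With real data the dictionary would hold densitometry values normalized to the ladder, and a tie or a repeated city would signal that the dominant pairs alone do not define a unique path.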
Conclusion The consistency between the final optimal answer calculated by DNA computing and the initial design confirms the feasibility of the design of the computer. The computer was used to solve a TSP with 15 vertices and 210 edges, the largest problem to be solved with DNA to date. It is not clear how many operations were performed during the execution of this problem since not all possible answers are formed. However, the number can be estimated using the concentration of the cities (1 picomolar), resulting in approximately 1.5 × 10¹³ operations during the hybridization. Hybridization was estimated to occur when the temperature decreased from 70 °C to 60 °C based on the melting temperatures of the sequences used (Table 1). Thus during this 10 minute time period the computer performed 250 trillion operations per second. The ability to solve a problem of this magnitude using DNA was only possible as the result of the development of several technical innovations presented here. During answer formation, a novel multistep ligation process was used to promote the formation, elongation, and termination of correct answers. Prior to amplification by PCR, a magnetic purification step was introduced to ensure that all retained sequences contained both Astart and Aend, and thus minimized inappropriate strand elongation during PCR. During answer amplification, cross hybridization was minimized and the fidelity of PCR was maximized by the use of longer Astart and Aend sequences with higher GC content that increased the melting temperature. Previous studies used standard PAGE protocols, which result in a smeared profile due to the formation of secondary structure in the answer sequences.5,22,24,26,27 Secondary structure formation during electrophoresis was avoided here through the use of super denaturing PAGE such that answer sequences were separated based upon their true length.
It is noteworthy that in previous DNA computing5,22,24,26,27 water and heat were used for magnetic bead separation. However, magnetic beads are unable to remove the selected target molecules under these conditions because water and heat have been shown to break the avidin–biotin link.28,29 To avoid dissociation of the biotinylated probe from the magnetic beads during thermal denaturation, we used NaOH at room temperature to denature the answer sequences from the probes bound to the magnetic beads. This method successfully isolated the targets and eliminated the contamination of the answer pool by short complementary probe sequences that were able to act as PCR primers during downstream amplification steps. Answers were successfully found through the use of LCR to determine the abundance of ordered pairs of cities in the answer read-out step, thereby avoiding the problems inherent to other methods.5,19–22,30 The results presented here demonstrate that interactions of DNA molecules can be used to generate a random sample of a population by solving a problem with 1.3 × 10¹² possible solutions. When edge efficiency is encoded by the concentration of the edge DNA, the answer population is symmetrically distributed where the mode and mean correspond to the optimal answer. The probability of any one particular path forming can be defined as the product of the probabilities of each edge in the path being traversed. Thus for the path (A,B,C,D,E) the probability was $P_{A,E} = P(A,B)\,P(B,C)\,P(C,D)\,P(D,E)$. The probability of any one connection was $P(A,B) = \frac{[AB]}{\sum_{X} [AX]}$, where X is a member of the set of all vertices in the graph and the square brackets denote concentration. Thus, the larger the concentration of a particular edge, the more likely it will be part of a path. Since there was a non-zero probability of any particular path in the graph being formed, the entire solution space of the problem was searched.
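As a worked illustration of the two expressions above, the following Python sketch evaluates the edge and path probabilities using the relative pathway concentrations of Table 2. It is a simplified Markov-chain reading of those formulas under our own assumptions: each step is normalized over all edges leaving the current city, the constraint that a city may not be revisited is ignored, and hybridization kinetics are not modeled. The function names are ours.

```python
from math import prod

CITIES = ["As"] + list("BCDEFGHIJKLMNO") + ["Ae"]


def weight(a: str, b: str) -> int:
    """Relative pathway concentration [ab], following the pattern of Table 2."""
    if a == b or b == "As" or a == "Ae":
        return 0                      # no self-edges, nothing enters As or leaves Ae
    if CITIES.index(b) == CITIES.index(a) + 1:
        return 100                    # upper diagonal: 100-fold excess
    return 1


def edge_probability(a: str, b: str) -> float:
    """P(a,b) = [ab] / sum over X of [aX]."""
    return weight(a, b) / sum(weight(a, x) for x in CITIES)


def path_probability(path) -> float:
    """Product of the edge probabilities along consecutive cities in the path."""
    return prod(edge_probability(a, b) for a, b in zip(path, path[1:]))


optimal = CITIES
# A suboptimal Hamiltonian path: cities B and C exchanged.
swapped = ["As", "C", "B"] + CITIES[3:]

if __name__ == "__main__":
    p_opt = path_probability(optimal)
    p_sub = path_probability(swapped)
    print(f"P(optimal) = {p_opt:.3e}")
    print(f"P(swapped) = {p_sub:.3e}")
    print(f"ratio      = {p_opt / p_sub:.3e}")
```

Under these assumptions, exchanging two neighbouring cities replaces three 100-weighted edges with three 1-weighted edges while leaving the normalizing sums unchanged, so the optimal path is favoured over that neighbour by a factor of about 10⁶.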
Generating a subset of answers is equivalent to taking a random sampling of the population where the mean of the population is related to the mean of the sample using the classic relation: μ = Sample Mean ± error. The error can be estimated from the standard deviation of the sample divided by the square root of the sample size. In this case the sample size is the number of molecules (∼10⁸–10¹⁰ based upon a ligation efficiency of 70%) that form correct answers. Thus, the Law of Large Numbers ensures that a statistically relevant sample can be generated. Consequently, the number of possible answers to the problem can exceed the number of molecules in the reaction, which allows problems with a large number of variables to be solved with manageable amounts of DNA. Probabilistic computing is limited to solving optimization problems, since it generates a subset of answers. Thus it would not be able to solve a Hamiltonian path problem with more paths than the number of molecules in the computation. However, it is well suited to solve problems where there is one particular answer or set of answers that is better than the others. For example, a flow capture problem is well suited to be solved by probabilistic computing. This type of problem in particular highlights the power of the method, since pathways can potentially contain infinite cycles that cause problems for heuristic algorithms. Since the method used here relies on the formation of answers based upon the probability of DNA hybridization, there is nothing to explicitly prevent long cycles. However, long cycles will occur with relatively low probability, since every additional hybridization event decreases the likelihood of formation.

References
1 A. B. Kahng and S. Reda, Op. Res. Lett., 2004, 32, 499–509.
2 L. Bomble, D. Lauvergnat, F. Remacle and M. Desouter-Lecomte, J. Chem. Phys., 2008, 128, 064110.
3 K. Flensberg, Nat. Nanotechnol., 2008, 3, 72–73.
4 J. Zhang, U. Lourderaj, S. V. Addepalli, W. A. de Jong and W. L. Hase, J. Phys. Chem. A, 2008, DOI: 10.1021/jp808146c.
5 L. M. Adleman, Science, 1994, 266, 1021–1024.
6 P. Fu, Biotechnol. J., 2007, 2, 91–101.
7 R. J. Lipton, Science, 1995, 268, 542–545.
8 Q. Ouyang, P. D. Kaplan, S. Liu and A. Libchaber, Science, 1997, 278, 446–449.
9 D. Faulhammer, A. R. Cukras, R. J. Lipton and L. F. Landweber, Proc. Natl. Acad. Sci. U. S. A., 2000, 97, 1385–1389.
10 R. S. Braich, N. Chelyapov, C. Johnson, P. W. K. Rothemund and L. Adleman, Science, 2002, 296, 499–502.
11 J. Gomez-Segura, J. Veciana and D. Ruiz-Molina, Chem. Commun., 2007, 3699–3707.
12 K. B. Holt, Philos. Trans. R. Soc. London, Ser. A, 2007, 365, 2845–2861.
13 K. Y. Sanbonmatsu and C. S. Tung, J. Struct. Biol., 2007, 157, 470–480.
14 P. Warren, IEE Proc.: Nanobiotechnol., 2004, 151, 1–9.
15 J. Parker, EMBO Rep., 2003, 4, 7–10.
16 K. Sakamoto, H. Gouzu, K. Komiya, D. Kiga, S. Yokoyama, T. Yokomori and M. Hagiya, Science, 2000, 288, 1223–1226.
17 Q. Liu, L. Wang, A. G. Frutos, A. E. Condon, R. M. Corn and L. M. Smith, Nature, 2000, 403, 175–179.
18 L. Wang, J. G. Hall, M. Lu, Q. Liu and L. M. Smith, Nat. Biotechnol., 2001, 19, 1053–1059.
19 Z. Ibrahim, M. F. M. Saaid, A. S. Paramita, A. Suyama and J. A. Rose, IEEE Congress on Evolutionary Computation, 2007, 1829–1834.
20 D. Spetzler, F. Xiong and W. D. Frasch, Lecture Notes in Computer Science, Springer, Berlin/Heidelberg, 2008, vol. 4848, pp. 152–160.
21 A. Narayanan and S. Zorbalas, in Proceedings of Genetic Programming, ed. J. Koza, Morgan Kaufmann, San Francisco, 1998, pp. 718–723.
22 M. Yamamoto, N. Matsuura, T. Shiba, Y. Kawazoe and A. Ohuchi, Solutions of Shortest Path Problems by Concentration Control, Springer, Berlin/Heidelberg, 2002, pp. 203–212.
23 M. H. Garzon, N. Jonoska and S. A. Karl, BioSystems, 1999, 52, 63–72.
24 J. Y. Lee, S.-Y. Shin, T. H. Park and B.-T. Zhang, BioSystems, 2004, 78, 39–47.
25 I. Vattulainen, T. Ala-Nissila and K. Kankaala, Phys. Rev. E: Stat. Phys., Plasmas, Fluids, Relat. Interdiscip. Top., 1995, 52, 3205–3214.
26 J. Y. Lee, S.-Y. Shin, S. J. Augh, T. H. Park and B.-T. Zhang, Lecture Notes in Computer Science, Springer, Berlin/Heidelberg, 2003, vol. 2568, pp. 73–84.
27 Z. X. Yin, F. Y. Zhang and J. Xu, J. Chem. Inf. Comput. Sci., 2002, 42, 222–224.
28 A. Holmberg, A. Blomstergren, O. Nord, M. Lukacs, J. Lundeberg and M. Uhlen, Electrophoresis, 2005, 26, 501–510.
29 S. C. Wu and S. L. Wong, J. Biol. Chem., 2005, 280, 23225–23231.
30 F. Tanaka, A. Kameda, M. Yamamoto and A. Ohuchi, Nucleic Acids Res., 2005, 33, 903–911.
This journal is © The Royal Society of Chemistry 2009
TI - Solving the fully-connected 15-city TSP using probabilistic DNA computing
JF - Integrative Biology
DO - 10.1039/b821735c
DA - 2009-03-01
UR - https://www.deepdyve.com/lp/oxford-university-press/solving-the-fully-connected-15-city-tsp-using-probabilistic-dna-vSvZYFWOqH
SP - 275
EP - 280
VL - 1
IS - 3
DP - DeepDyve
ER -