Comprehensive transcript analysis in small quantities of mRNA by SAGE-Lite
Peters, David G.; O'Hare, Elisa Heidrich; Ferrell, Robert E.; Kassam, Amin B.; Yonas, Howard; Brufsky, Adam M.
doi: N/A pmid: N/A
Serial analysis of gene expression (SAGE) is a powerful technique that can be used for global analysis of gene expression. Its chief advantage over other methods is that SAGE does not require prior knowledge of the genes of interest and provides quantitative and qualitative data of potentially every transcribed sequence in a particular tissue or cell type. Furthermore, SAGE can quantify low-abundance transcripts and reliably detect relatively small differences in transcript abundance between cell populations. However, SAGE demands high input levels of mRNA which are often unavailable, particularly when studying human disease. To overcome this limitation, we have developed a modification of SAGE that allows detailed global analysis of gene expression in extremely small quantities of tissue or cultured cells. We have called this approach ‘SAGE-Lite’. This technique was used for the global analysis of transcription in samples of normal and pathological human cerebrovasculature to study the molecular pathology of intracranial aneurysms. These samples, which are obtained during operative surgical repair, are typically no bigger than 1 or 2 mm and yield <100 ng of total RNA. In addition, we show that SAGE-Lite allows simple and rapid isolation of long cDNAs from short (15 bp) SAGE sequence tags.
Improved fidelity of thermostable ligases for detection of microsatellite repeat sequences using nucleoside analogs
Zirvi, Monib; Barany, Francis; Bergstrom, Donald E.; Saurage, Andrea S.; Hammer, Robert P.
doi: 10.1093/nar/27.24.e41 pmid: 10572193
Abstract

Microsatellite repeats consisting of dinucleotide sequences are ubiquitous in the human genome and have proven useful for linkage analysis, positional cloning and forensic identification purposes. In this study, the potential of utilizing the ligase detection reaction for the analysis of such microsatellite repeat sequences was investigated. Initially, the fidelity of thermostable DNA ligases was measured for model dinucleotide repeat sequences. Subsequently, the effect of modified oligonucleotides on ligation fidelity for dinucleotide repeats was determined using the nucleoside analogs nitroimidazole, inosine, 7-deazaguanosine and 2-pyrimidinone, as well as natural base mismatches. The measured error rates for a standard dinucleotide template indicated that the nitroimidazole nucleoside analogs could be used to increase the fidelity of ligation when compared to unmodified primers. Furthermore, use of formamide in the ligation buffer also increased ligation fidelity for dinucleotide repeat sequences. Using ligation-based assays to detect polymorphic alleles of microsatellite repeats in the human genome opens the possibility of using array-based typing of these loci for human identification, loss-of-heterozygosity studies and linkage analysis.

Introduction

Microsatellites are polymorphic repetitive sequences which occur throughout the human genome. Analysis of these repetitive sequences has proved very useful in human genome studies due to their highly polymorphic nature and the ability to determine chromosomal segregation patterns in human pedigrees. Association studies which identify given microsatellite loci with the transmission of a disease phenotype have been used extensively in the last decade for the positional cloning of a number of genes responsible for inherited human disorders such as cystic fibrosis (CFTR), Duchenne muscular dystrophy, Wilms' tumor (WT1) and familial adenomatous polyposis (1–5).
In general, microsatellite repeats are analyzed by PCR amplification followed by electrophoretic separation on gels (6,7); however, detection of variations occurring within microsatellite repeats and closely clustered mutations is complicated by artifacts that occur during PCR. Our laboratory has developed a combined polymerase chain reaction/ligase detection reaction (PCR/LDR) method for discriminating single-base mutations, as well as mononucleotide repeat polymorphisms (8–12). In this study, detection of alterations in dinucleotide repeat sequences was carried out using ligation mediated by thermostable ligases. The effect of using modified oligonucleotide primers in ligation for this standard template was also determined for various nucleoside analogs. The analogs tested included nitroimidazole, inosine, 7-deazaguanosine and 2-pyrimidinone, as well as natural base mismatches. Many of these nucleoside analogs have the potential to modulate ligation fidelity through a combination of effects on melting temperature and enzyme-template interactions.

Materials and Methods

Modified nucleoside analogs

The nitroimidazole nucleoside phosphoramidite was synthesized according to previously published procedures (13). The pyrimidin-2-one nucleoside has been prepared previously by a variety of methods, including coupling of the mercury salt of pyrimidin-2-one to a 2-deoxy chloro sugar (14) and deamination of 2′-deoxycytosine (15). We have used a typical silylation/nucleosidation procedure of 3,5-bis-O-p-toluoyl-1-chloro-β-d-ribose to provide the β-anomer in 40% yield. Standard deprotection, tritylation and phosphitylation protocols were used to convert this to the nucleoside phosphoramidite, which could then be incorporated into oligonucleotides (see experimental details below). The inosine, 7-deazaguanosine, dP and 5-fluoro-deoxyuridine phosphoramidites were obtained from Glen Research (Sterling, VA).
The various nucleoside analogs were then incorporated into ligation primers using standard oligonucleotide synthesis conditions. The sequences of the various primers used are shown in Table 1.

Table 1. Synthetic templates and nucleoside analog modified LDR primers

1-(2′-Deoxy-3′,5′-bis-O-p-toluoyl-β-d-ribofuranosyl)-pyrimidin-2-one

Hexamethyldisilazane (6.9 ml, 31.4 mmol) was added to a flask containing 2-pyrimidinone hydrochloride (0.93 g, 7.0 mmol) with a few crystals of ammonium sulfate. The reaction mixture was heated at reflux for 1 h, resulting in a yellowish solution. Once cool, the excess solvent was removed under diminished pressure. Immediately following the removal of the solvent, freshly distilled chloroform was added under argon, followed by addition of 3,5-bis-O-p-toluoyl-1-chloro-β-d-ribose (16) (1.40 g, 3.6 mmol). After stirring overnight, the reaction mixture was washed with saturated NaHCO3, H2O and saturated NaCl, dried with Na2SO4, filtered and the solvent removed under reduced pressure, yielding a creamy white solid. This solid was redissolved in chloroform and crystallized overnight using 3:1 hexanes:EtOAc at reduced temperature. The resulting crystals (>95% β-anomer) were filtered and rinsed with cool 3:1 hexanes:EtOAc, yielding a white solid (0.65 g, 40%). 1H (200 MHz, CDCl3) δ 8.65 (t, 1H, H4), 8.05 (dd, 1H, H6), 7.96 (d, 4H, Ar), 7.61 (d, 4H, Ar), 6.32 (m, 2H, H5, H1′), 5.58 (d, 1H, H3′), 4.93 (t, 1H, H4′), 4.59 (d, 2H, H5′, H5″), 3.10–2.99 (m, 1H, H2′), 2.77 (d, 1H, H2′), 2.42 (s, 3H, CH3), 2.41 (s, 3H, CH3). 13C (50 MHz, CDCl3) δ 166.17 [C(O)], 165.79 (C4), 165.42 [C(O)], 155.23 (C2), 144.26 [C(O)-Ar], 143.82 [C(O)-Ar], 142.8 (C6), 129.66 (Ar), 129.11 (Ar), 126.83 (Ar-CH3), 126.55 (Ar-CH3), 103.4 (C5), 89.84 (C1′), 82.84 (C4′), 75.06 (C3′), 64.03 (C5′), 37.41 (C2′), 21.66 (CH3).
1-(2′-Deoxy-β-d-ribofuranosyl)-pyrimidin-2-one

1-(2′-Deoxy-3′,5′-bis-O-p-toluoyl-β-d-ribofuranosyl)-pyrimidin-2-one (0.686 g, 1.49 mmol) was deprotected using ammonia-saturated methanol (100 ml) overnight. The solvent was removed under diminished pressure. A solution of 9:1 chloroform:methanol was added to the dried reaction mixture to precipitate a white solid in 43% yield (0.136 g, 0.641 mmol). 1H (200 MHz, CD3OD) δ 8.52 (t, 1H, H4), 8.22 (dd, 1H, H6), 6.47 (t, 1H, H5), 5.97 (d, 1H, H1′), 4.19 (t, 1H, H3′), 4.10 (s, 1H, H4′), 3.36 (d, 2H, H5′, H5″), 2.62–2.51 (m, 1H, H2′), 1.97 (d, J = 11.2, 1H, H2′).

1-(2′-Deoxy-5′-dimethoxytrityl-β-d-ribofuranosyl)-pyrimidin-2-one

1-(2′-Deoxy-β-d-ribofuranosyl)-pyrimidin-2-one (0.130 g, 0.64 mmol) was co-evaporated with two portions of pyridine then dissolved in pyridine (4 ml). Next, addition of N,N-dimethylaminopyridine (0.11 g, 0.992 mmol) was followed by diisopropylethylamine (0.8 ml), dimethoxytrityl chloride (1.05 g, 3.09 mmol) and dimethylaminopyridine (1 mg). The reaction was left stirring overnight under inert atmosphere. Purification of the crude product by flash chromatography yielded 0.136 g (39%) of yellowish oil. 1H (250 MHz, CDCl3) δ 8.52 (t, 1H, H4), 8.20 (dd, 1H, H6), 7.42–7.22 (m, 8H, Ar-OCH3, Ar-OCH3), 6.82 (m, 5H, Ar), 6.30 (t, 1H, H5), 6.22 (d, 1H, H1′), 4.53 (t, 1H, H3′), 4.47 (d, 1H, H4′), 3.78 (s, 6H, Ar-OCH3), 3.18–3.13 (m, 2H, H5′, H5″), 2.25–2.18 (m, 1H, H2′), 1.94 (d, 1H, H2′).

1-(2′-Deoxy-5′-dimethoxytrityl-3′-O-2-cyanoethyl-N,N-diisopropylphosphoramidite-β-d-ribofuranosyl)-pyrimidin-2-one

1-(2′-Deoxy-5′-dimethoxytrityl-β-d-ribofuranosyl)-pyrimidin-2-one (0.15 g, 0.309 mmol) was twice co-evaporated with pyridine then dissolved in CH2Cl2 (3 ml) under argon. Next, diisopropylchlorocyanoethylphosphoramidite (100 µl, 0.61 mmol) was added slowly and the solution was stirred for 1 h.
The crude reaction was quenched with methanol and then washed with brine (5.4 M NaCl), dried with Na2SO4 and concentrated under reduced pressure. Purification by flash chromatography (1:1:1 hexanes:EtOAc:Et3N) provided a yellow foam (38 mg, 16%). 31P (101.2 MHz, CDCl3) δ 151.20, 150.18, mixture of diastereomers.

Oligonucleotide synthesis and purification

Oligonucleotides were synthesized on an ABI 394 DNA Synthesizer (PE Biosystems Inc., Foster City, CA). Oligonucleotides used in LDR were purified by electrophoresis on 10% polyacrylamide/7 M urea gels. Bands were visualized by UV shadowing and excised from the gel. They were eluted overnight at 64°C in TNE buffer (100 mM Tris-HCl pH 8.0, 500 mM NaCl, 5 mM EDTA). The oligonucleotides were recovered from the eluate using C18 Sep-Pak cartridges (Waters Corp., Milford, MA) following the manufacturer's instructions. Oligonucleotides were resuspended to ∼1 mM in 100 µl TE (10 mM Tris pH 8.0, 1 mM EDTA). For LDR, gel-purified stock solutions were diluted to 100 µM (100 pmol/µl). The upstream, or discriminating, oligonucleotides had fluorescent reporter groups at their 5′ termini. The fluorescent oligonucleotides were synthesized using Fam and Tet phosphoramidites (PE Biosystems Inc., Foster City, CA). The downstream, or common, oligonucleotides were phosphorylated at the 5′ end using a chemical phosphorylation reagent, and blocked at the 3′ end using a 3′-spacer C3 CPG (Glen Research, Sterling, VA). The use of a chemical phosphorylation reagent is an alternative to the enzymatic techniques for oligonucleotide phosphorylation, with the advantage of allowing phosphorylation efficiency to be determined.

LDR conditions

LDR reactions were carried out in a 20 µl mixture containing 20 mM Tris-HCl pH 7.6, 10 mM MgCl2, 100 mM KCl, 10 mM DTT, 1 mM NAD+, 25 nM (500 fmol) of the detecting primers and mixtures of PCR products from cell lines or tumor samples.
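The amounts quoted for the LDR mixture follow from simple unit identities (1 nM = 1 fmol/µl and 1 µM = 1 pmol/µl). The short Python sketch below is only an illustration of that arithmetic; the function names are hypothetical and are not part of the original protocol.

```python
def amount_fmol(conc_nM: float, volume_ul: float) -> float:
    """Amount of solute in fmol.

    1 nM = 1e-9 mol/L = 1e-15 mol/µl = 1 fmol/µl,
    so concentration (nM) times volume (µl) gives fmol directly.
    """
    return conc_nM * volume_ul


def stock_pmol_per_ul(conc_uM: float) -> float:
    """Stock concentration in pmol/µl (1 µM = 1 pmol/µl)."""
    return conc_uM


if __name__ == "__main__":
    # 25 nM detecting primer in a 20 µl reaction, as stated in the text
    print(amount_fmol(25, 20), "fmol of each detecting primer")
    # 100 µM gel-purified stock, as stated in the text
    print(stock_pmol_per_ul(100), "pmol/µl stock")
```

With the values from the text, 25 nM in 20 µl corresponds to the stated 500 fmol of primer, and the 100 µM stock to the stated 100 pmol/µl.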
The reaction mixture was heated for 1.5 min at 94°C prior to adding 25 fmol of the wild-type or mutant Tth DNA ligase. Ligases were overproduced and purified as described previously (8,17). LDR reactions were thermally cycled for 20 cycles of 15 s at 94°C and 2 min at 65°C. Reactions were stopped by chilling the tubes in an ethanol-dry ice bath, and adding 0.5 µl of 0.5 mM EDTA. Aliquots of 2.5 µl of the reaction products were mixed with 2.5 µl of loading buffer (83% formamide, 8.3 mM EDTA and 0.17% Blue Dextran). The mixture was supplemented with 0.5 µl Rox-1000, or TAMRA 350 molecular weight marker, denatured at 94°C for 2 min, chilled rapidly on ice prior to loading on an 8 M urea-10% polyacrylamide gel, and electrophoresed on an ABI 373 DNA sequencer at 1400 V for 2 h. Fluorescent ligation products were analyzed and quantified using the ABI Gene Scan 672 software. The amount of product was calculated from a calibration curve (one fmol = 600 peak area units).

Results and Discussion

Detection of dinucleotide repeat polymorphisms using thermostable ligase

The initial model microsatellite repeat chosen for this study is the polymorphic D6S291 sequence for which the three most frequent alleles contain CA11, CA12 and CA13 dinucleotide repeats. LDR primers were designed and synthesized for this sequence containing repeats with 11, 12 or 13 CA dinucleotides (Fig. 1, top). These primers were then ligated using synthetic templates and LDR products were analyzed on an ABI 373 DNA sequencer. The results (Fig. 1, bottom) demonstrate that ligation yielded predominantly the correct size product, but misligations on incorrect templates were up to 25% of the correct product signal. When using primers to distinguish CA18 from CA19 synthetic templates, the amount of product formed for both templates was equal regardless of the LDR primers used (data not shown).
This indicates that the thermostable ligase loses its fidelity for discriminating dinucleotide slippage in repeats >36 nt in length. Table 2 summarizes the amount of misligation found when using Tth DNA ligase to detect various length mono- and dinucleotide microsatellite repeats.

Table 2. Summary of misligation error rate and ligation fidelity of Tth ligase for various length microsatellite repeats

Use of nucleoside analogs to improve discrimination of dinucleotide repeat sequences

A variety of primers containing nucleoside analogs were tested to determine if such modifications could help increase the specificity of ligation. The LDR primers synthesized had from five to seven dinucleotide repeat sequences and ligation fidelity was compared to the standardized assay on CA11–CA13 templates. The various analogs tested and their special properties are listed in Table 3. Structures of the various nucleoside analogs compared with the natural bases are shown in Figure 2.

Figure 1. LDR results for dinucleotide repeat sequences. (Top) Primers designed to detect CA11, CA12 and CA13 templates. The ligase reaction was carried out using 500 fmol of each primer and 50 fmol of template DNA in a 20 µl reaction with 25 fmol of Tth ligase per reaction. After ligation, the primers were separated and analyzed on an ABI 373 automated DNA sequencer. (Bottom) Ligation results for the detection of synthetic CA repeat templates using matched and mismatched LDR primer pairs. The ‘A’ columns represent lanes in which a multiplex ligation reaction was carried out using primers that could detect CA13, CA12 and CA11. The crosstalk from misligation of LDR primers, which differed by one CA unit, can be seen as off-diagonal bands.
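The misligation figures reported in this section are ratios of the incorrect to the correct product signal, with peak areas converted to fmol using the stated calibration (one fmol = 600 peak area units). A minimal Python sketch of that calculation follows; the function names and the example peak areas are hypothetical and are not measured values from this study.

```python
AREA_UNITS_PER_FMOL = 600.0  # calibration stated in the text


def area_to_fmol(peak_area: float) -> float:
    """Convert a fluorescent peak area to fmol of ligation product."""
    return peak_area / AREA_UNITS_PER_FMOL


def misligation_percent(correct_area: float, misligated_area: float) -> float:
    """Misligation signal expressed as a percentage of the correct signal."""
    if correct_area <= 0:
        raise ValueError("correct product signal must be positive")
    return 100.0 * misligated_area / correct_area


if __name__ == "__main__":
    # Hypothetical peak areas chosen only to illustrate the arithmetic
    correct, wrong = 60000.0, 15000.0
    print(area_to_fmol(correct), "fmol correct product")
    print(area_to_fmol(wrong), "fmol misligated product")
    print(misligation_percent(correct, wrong), "% misligation")
```

Because the calibration is linear, the percentage is the same whether it is computed from peak areas or from the converted fmol amounts.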
Use of nucleoside analogs such as nitroimidazole (13) and pyrimidin-2-one in place of cytidine had previously been shown to lower the overall Tm contribution of a G:C base pair by ∼10°C/modification while at the same time maintaining the specificity of hybridization. Analogs of guanosine that have been shown to decrease duplex stability include inosine and 7-deazaguanosine (18). However, these nucleoside analogs are also capable of mispairing as a result of reduced specificity during hybridization. Preliminary studies have shown that the fidelity of ligation can be improved by incorporating nucleoside analogs such as 3-nitropyrrole two bases upstream of the ligation junction (19). In general, ligation of primers hybridized to dinucleotide microsatellite repeat sequences showed greater slippage than mononucleotide repeats (12). Analysis using modified primers indicated that misligation could be reduced on average to 15% of the correct signal (Fig. 3). Modifications which improved fidelity included increasing the ligation temperature, adding formamide to the ligation buffer, or introducing deliberate mismatches or the nitroimidazole nucleoside analog into the LDR primers.
Altering location or spacing of the nitroimidazole analogs in the primers did not significantly change ligation fidelity compared to the set of nitroimidazole-containing primers shown in Table 1 (data not shown). The strongest signals, however, were obtained using primers without analogs. Tth DNA ligase could maintain high fidelity in a solution containing >10% formamide and, furthermore, ligation fidelity increased under these conditions (Fig. 3).

Figure 2. Structure of nucleotide bases tested in LDR primers for ligation efficiency in dinucleotide repeats. The nucleotide bases include: A, guanine; B, 7-deazaguanine; C, adenine; D, hypoxanthine; E, cytosine; F, thymine; G, uracil; H, 5-fluorouracil; I, pyrimidin-2-one; J, 4-nitroimidazole. The bases are linked to C1′ of deoxyribose through the numbered nitrogen atom.

Table 3. Nucleoside analogs incorporated in LDR primers to test for reduction of mismatch ligation on dinucleotide repeat targets

Figure 3. Graph of average misligation error rate for dinucleotide repeats using modified oligonucleotides. Ligation reaction conditions were modified by using primers containing nucleoside analogs, mismatches or the addition of formamide to the ligation buffer.
The y-axis in the figure indicates the average percentage of misligation error for CA11, CA12 and CA13 templates generated using multiple trials with each modification. The bars indicate which modifications were introduced at alternating dinucleotide repeats in the standard LDR primers (see Table 1 for sequences).

The 7-deazaguanosine and inosine containing primers did not improve the ligation fidelity for the different repeats tested (Fig. 3). This result may be due to mispairing of these nucleoside analogs with adjacent bases on the opposite strand (18). Primers containing the nitroimidazole analog did improve the ligation fidelity for the microsatellite repeats tested. Finally, polyacrylamide gel electrophoresis demonstrated that the 2-pyrimidinone-containing primers were cleaved instead of ligating during thermocycling (data not shown). The instability of pyrimidin-2-one nucleosides to acidic conditions has been previously noted (20). Given the strong temperature dependence of the pKa of Tris buffer (ΔpKa = −0.033/°C) (21,22), the LDR reactions at 95°C are at approximately pH 5, which is sufficient to remove the pyrimidin-2-one nucleobase, leading to abasic sites and strand scission under the LDR conditions. Future studies will focus on pyrimidin-2-one analogs with electron-withdrawing groups to stabilize them to acidic hydrolysis.
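The estimate of roughly pH 5 at 95°C can be checked with a linear extrapolation from the stated coefficient of −0.033 per °C. The Python sketch below is only illustrative and rests on assumptions that are not stated in the text: that the buffer pH of 7.6 was specified at 25°C, that the pH tracks the pKa of Tris, and that the coefficient is constant over the whole temperature range. The function name is hypothetical.

```python
def tris_ph_at(temp_c: float,
               ph_ref: float = 7.6,         # LDR buffer pH from the text
               temp_ref_c: float = 25.0,    # ASSUMPTION: reference temperature
               dpka_per_c: float = -0.033   # coefficient quoted in the text
               ) -> float:
    """Linear estimate of Tris buffer pH at temp_c (assumes pH tracks pKa)."""
    return ph_ref + dpka_per_c * (temp_c - temp_ref_c)


if __name__ == "__main__":
    # Denaturation and ligation temperatures used in the LDR cycling
    for temp in (65.0, 94.0, 95.0):
        print(f"{temp:.0f} C: estimated pH {tris_ph_at(temp):.2f}")
```

Under these assumptions the estimate at 95°C is about pH 5.3, consistent with the approximate value given above, while the 65°C ligation step stays near pH 6.3.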
Overall, ligation fidelity could be increased by the introduction of nitroimidazole within the repeat region or the addition of mismatches near the ligation junction. Other conditions either diminished the total signal from ligation or increased the misligation error compared to unmodified ligation primers. Using combinations of formamide and the nitroimidazole-containing LDR primers, the misligation error was reduced to 10% of the correct signal for repeats up to 13 dinucleotides in length (data not shown). In an attempt to accentuate the differences in melting temperature between the CA repeat regions and the flanking unique sequences, primers were synthesized with long tails or propynyl groups in the flanking sequence. Even when such ligations were performed at higher temperatures and/or in the presence of formamide, these modifications decreased ligation fidelity compared with the standard conditions (data not shown). This result suggests that it is not the relative Tm values which predict ligation fidelity, but the substrate structure near the ligation junction as well as the overall length of the repeat sequences.

Concluding Remarks

PCR/LDR has been used extensively in the past for the analysis of disease-associated point mutations in non-repetitive DNA sequences (1,8–11). However, the capabilities of PCR/LDR for analysis of microsatellite repeat sequences were not known. In this study, the limits of using ligation for the analysis of a dinucleotide repeat were determined for the Tth DNA ligase. For dinucleotide repeats, the amount of noise due to misligation increased by >20-fold when compared to the results obtained for mononucleotide repeats (compare A10→A9 to CA13→CA12 in Table 2). This result suggests that the ligase footprint is close to the span of 26 nt of a CA13 repeat and certainly larger than the span of 10 adenosine nucleotides.
The Tth DNA ligase most likely stabilizes the DNA duplex directly below its footprint, but may allow nucleotides to loop out of the double helix outside its footprint (Fig. 4). This model correlates with the increase in misligation errors observed above CA13 repeat length. Ligation of longer dinucleotide repeats, such as CA19 and CA18, lacked specificity altogether, and produced equal amounts of ligation product on each template. There is experimental evidence that Tth DNA ligase can also loop out a base in non-repetitive sequences at low efficiency (23).

Figure 4. Schematic diagram of theoretical model for the fidelity of thermostable DNA ligase for microsatellite repeat sequences. (A) For short mononucleotide repeat sequences, the ligase footprint is large enough to span the entire repetitive region, i.e. one helical turn for an A10 repeat. (B) For intermediate length dinucleotide repeat sequences such as CA13, the LDR primers can sporadically loop out a single dinucleotide unit outside or at the border of the ligase footprint, therefore increasing the misligation rate. (C) For long dinucleotide repeat sequences (CA18 and CA19), the loopout of a single repeat unit occurs frequently and therefore eliminates specificity during ligation.

By using modified nucleosides within the repeat region, the misligation errors were reduced to <15% of signal obtained for the correct product. The mechanism by which a nucleoside analog can exert its effect on ligation is through the destabilization of the repeat region by lowering the Tm of the repetitive portion of the LDR primer. Due to the alteration of the stability of the repetitive tract, the ligase has more difficulty in aligning the primers at the nick. This reduces the amount of product generated from misligations when compared to primers that have stable repetitive regions. Similarly, deliberate mismatches near the ligation junction make it more difficult for the ligase to form incorrect products resulting from misligation during LDR. An analysis of the 5700 microsatellites in the CEPH database (http://www.cephb.fr/bio/ceph-genethon-map.html) was performed and it was determined that the majority of the CA repeats contained in these mapped amplicons were greater than 20 repeat units in length. However, a group of 40 loci with heterozygosities ranging from 0.60 to 0.80 have been identified with the longest allele consisting of CA15. Furthermore, over 1000 dinucleotide repeat sequences with at least one allele less than CA15 were found in sequences from the Whitehead Institute sequence tagged sites database (http://www.genome.wi.mit.edu/). Such dinucleotide repeats would be amenable for analysis using ligase-based detection. The average single-nucleotide polymorphism (SNP) has a typical heterozygosity ranging between 0.40 and 0.50. To yield the analytical power of 1000 dinucleotide repeats for positional cloning studies, analysis of at least 1750 SNPs would be required. The CEPH database contains an inherent bias towards dinucleotide repeats which are greater than CA20 in length.
This is a result of the screening process which utilized long CA probes and hybridization to genomic libraries to identify highly polymorphic microsatellite loci. However, as the human genome sequence is completed, many shorter dinucleotide repeat loci will be found with heterozygosities which are significantly greater than those obtained using SNPs for genetic analysis. Therefore, in the future, it is anticipated that there will be a number of loci that could be analyzed by the approach used in this study. There is also evidence that polymorphic mononucleotide repeats are commonly found in plant and mammalian genomes (24–28). These regions could also be analyzed using LDR (12). Notable microsatellite repeats which are not amenable to ligase-based detection are the extremely long triplet repeat stretches related to genes responsible for Fragile X syndrome, myotonic dystrophy and Huntington's chorea (29). An important advantage of the use of ligase-based methods for the analysis of mutations and short repeat polymorphisms is the ability to eliminate the need for gel electrophoresis. By using a DNA array, it is possible to identify multiplex LDR products by hybridization instead of gel electrophoresis. In our laboratory, a prototype universal array has been developed that detects LDR products based on addressable sequence tags attached to the LDR primers (30). This array combined with PCR/LDR analysis was able to correctly identify mutations within cell lines known to carry specific mutations in the K-ras gene (11,30). In the future, such an array should be capable of typing both single nucleotide and microsatellite repeat sequence polymorphisms at multiple loci simultaneously. This would allow for more rapid methods of human identification, loss-of-heterozygosity detection and linkage analysis. 
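The comparison between 1000 dinucleotide repeats and at least 1750 SNPs made in the concluding remarks is not derived explicitly. One reading that is consistent with the quoted numbers is sketched below in Python. It assumes, purely for illustration, that the analytical power of a marker set scales linearly with marker heterozygosity, and it uses the midpoint (0.70) of the 0.60 to 0.80 range for the repeats and the lower bound (0.40) for SNPs. Both the scaling rule and the function name are assumptions, not the authors' stated method.

```python
def equivalent_snp_count(n_repeats: int,
                         repeat_het: float,
                         snp_het: float) -> float:
    """Number of SNPs matching n_repeats microsatellites.

    ASSUMPTION: informativeness is proportional to heterozygosity,
    so n_snps * snp_het = n_repeats * repeat_het.
    """
    if not (0.0 < repeat_het <= 1.0 and 0.0 < snp_het <= 1.0):
        raise ValueError("heterozygosity must lie in (0, 1]")
    return n_repeats * repeat_het / snp_het


if __name__ == "__main__":
    # Midpoint of 0.60-0.80 for the repeats, lower bound 0.40 for SNPs
    print(round(equivalent_snp_count(1000, 0.70, 0.40)))
    # Upper SNP bound of 0.50, for comparison
    print(round(equivalent_snp_count(1000, 0.70, 0.50)))
```

With a SNP heterozygosity of 0.40 this gives approximately 1750 SNPs, and with 0.50 approximately 1400, so the quoted figure corresponds to the less informative end of the SNP range under this simplified model.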
Acknowledgements

The authors would like to acknowledge Norman Gerry, Reyna Favis, Weiguo Cao, Jie Tong, Joseph Day, Andrew Grace, Jie Zhou and Matt D'Alessio for their helpful suggestions and technical assistance. M.Z. was supported by the William M. Keck Fellowship and the Tri-Institutional Medical Scientist Training Program (MSTP) Fellowship during the course of this study. Support for this work was provided by the National Cancer Institute (P01-CA65930 and RO1-CA81467).

References

1 Eggerding F.A., Iovannisci D.M., Brinson E., Grossman P., Winn-Deen E.S. Hum. Mutat., 1995, vol. 5 (pg. 153-165)
2 Weber J.L. Curr. Opin. Biotechnol., 1990, vol. 1 (pg. 166-171)
3 Cawkwell L., Lewis F.A., Quirke P. Br. J. Cancer, 1994, vol. 70 (pg. 813-818)
4 Powell S.M., Zilz N., Beazer-Barclay Y., Bryan T.M., Hamilton S.R., Thibodeau S.N., Vogelstein B., Kinzler K.W. Nature, 1992, vol. 359 (pg. 235-237)
5 Kinzler K.W., Vogelstein B. Cell, 1996, vol. 87 (pg. 159-170)
6 Mansfield D.C., Brown A.F., Green D.K., Carothers A.D., Morris S.W., Evans H.J., Wright A.F. Genomics, 1994, vol. 24 (pg. 225-233)
7 Ziegle J.S., Su Y., Corcoran K.P., Nie L., Mayrand P.E., Hoff L.B., McBride L.J., Kronick M.N., Diehl S.R. Genomics, 1992, vol. 14 (pg. 1026-1031)
8 Barany F. Proc. Natl Acad. Sci. USA, 1991, vol. 88 (pg. 189-193)
9 Barany F. PCR Methods Appl., 1991, vol. 1 (pg. 5-16)
10 Day D., Speiser P.W., White P.C., Barany F. Genomics, 1995, vol. 29 (pg. 152-162)
11 Khanna M., Park P., Zirvi M., Paty P., Barany F. Oncogene, 1999, vol. 18 (pg. 27-38)
12 Zirvi M., Nakayama T., Newman G., McCaffrey T., Ostrer H., Paty P., Barany F. Nucleic Acids Res., 1999, vol. 27 (pg. e40)
13 Bergstrom D.E., Zhang P., Johnson W.T. Nucleic Acids Res., 1997, vol. 25 (pg. 1935-1942)
14 Kohler P., Volz E. Nucleic Acid Chem., 1978 (pg. 283-289)
15 Gildea B., McLaughlin L.W. Nucleic Acids Res., 1989, vol. 17 (pg. 2261-2281)
16 Huttle R., Bucheke F. Chem. Ber., 1955, vol. 88 (pg. 1586-1590)
17 Barany F., Gelfand D. Gene, 1991, vol. 109 (pg. 1-11)
18 Brown D.M., Lin P.K. Carbohydr. Res., 1991, vol. 216 (pg. 129-139)
19 Luo J., Bergstrom D.E., Barany F. Nucleic Acids Res., 1996, vol. 24 (pg. 3071-3078)
20 Iocono J.A., Gildea B., McLaughlin L.W. Tetrahedron Lett., 1990, vol. 31 (pg. 175-178)
21 Day J., Bergstrom D., Hammer R., Barany F. Nucleic Acids Res., 1999, vol. 27 (pg. 1810-1818)
22 Day J., Hammer R., Bergstrom D., Barany F. Nucleic Acids Res., 1999, vol. 27 (pg. 1819-1827)
23 Tong J., Cao W., Barany F. Nucleic Acids Res., 1999, vol. 27 (pg. 788-794)
24 Aitman T.J., Hearne C.M., McAleer M.A., Todd J.A. Mamm. Genome, 1991, vol. 1 (pg. 206-210)
25 Ellegren H. Animal Genet., 1993, vol. 24 (pg. 367-372)
26 Powell W., Morgante M., Andre C., McNicol J.W., Machray G.C., Doyle J.J., Tingey S.V., Rafalski J.A. Curr. Biol., 1995, vol. 5 (pg. 1023-1029)
27 Provan J., Corbett G., Waugh R., McNicol J.W., Morgante M., Powell W. Proc. R. Soc. Lond. Ser. B Biol. Sci., 1996, vol. 263 (pg. 1275-1281)
28 Provan J., Corbett G., McNicol J.W., Powell W. Genome, 1997, vol. 40 (pg. 104-110)
29 Schalling M., Hudson T.J., Buetow K.H., Housman D.E. Nature Genet., 1993, vol. 4 (pg. 135-139)
30 Gerry N., Witowski N., Barany G., Barany F. J. Mol. Biol., 1999, vol. 292 (pg. 251-262)

© 1999 Oxford University Press
A multiple-capillary electrophoresis system for small-scale DNA sequencing and analysis
Zhang, Jianzhong; Voss, Karl O.; Shaw, Diana F.; Roos, K. Pieter; Lewis, Darren F.; Yan, Juying; Jiang, Rong; Ren, Hongji; Hou, Joan Y.; Fang, Yu; Puyang, Xiaoling; Ahmadzadeh, Hossein; Dovichi, Norman J.
doi: 10.1093/nar/27.24.e36
pmid: 10572188
Abstract
A five-capillary system has been developed for DNA sequencing and analysis. The post-column fluorescence detector is based on a sheath-flow cuvette. The instrument provides uniform and continuous illumination of the samples. The cuvette virtually eliminates cross-talk in the fluorescence signal between capillaries. Discrete single-photon counting avalanche photodiodes provide high efficiency light detection. The instrument has detection limits (3σ) of 130 ± 30 fluorescein molecules injected onto each capillary. Over 650 bases of sequence at 98.8% accuracy were generated in 100 min at 50°C from M13mp18. Separation and detection of short tandem repeats proved efficient and accurate with the use of internal standards for direct comparison of migration times between capillaries.
Introduction
Large-scale DNA sequencing projects require instruments that generate high throughput and high sequencing accuracy at low cost (1). Capillary electrophoresis provides low-cost, easily automated and rapid DNA sequencing (2–15).
The first multiple-capillary instrument was reported in 1990. Zagursky modified a commercial DuPont Genesis 2000 sequencer to operate with 500-µm ID capillaries (16). In that instrument, an argon ion laser beam was scanned across the capillary array. The instrument operated at 50 V cm⁻¹; 9.5 h were required to separate fragments 500 bases in length. Sequencing accuracy was <97% for fragments ranging from 29 to 512 bases in length. Mathies reported a similar scanning instrument to image a capillary array (17). That instrument operated with 100-µm ID capillaries and produced sequencing information up to 320 bases in length.
Duty cycle is an important parameter in specifying a detector's performance. In scanning systems, the optical system probes each capillary in sequence. Duty cycle is the fraction of time that a sample is illuminated.
Duty cycle is important because DNA fragments migrate from the capillary undetected during the period when a capillary is not illuminated. In scanning systems, the duty cycle decreases in proportion to the number of capillaries. In contrast, several systems have been developed to continually monitor fluorescence from a capillary array; these systems inherently have a much higher duty cycle than scanning systems. Yeung's group reported a multiple-capillary DNA sequencer in which a ribbon of capillaries was illuminated with a line-focused laser beam. Fluorescence was collected at right angles and imaged onto a CCD camera. The use of the CCD camera ensured that all capillaries were monitored simultaneously (18,19). Similarly, an eight-capillary DNA sequencer was reported based on the use of individual fiber-optics to deliver a laser beam to each capillary (20). A discrete laser beam is required to excite fluorescence from each sample. A second set of fibers transmits fluorescence to an imaging spectrograph and CCD detector. The sequence from 400 to 450 bases was generated in 1 h. These designs are relatively inefficient in their use of excitation light. For example, if 10 mW of light is required to excite fluorescence from each capillary, then 1 W of light is required to excite fluorescence from 100 capillaries. In a further improvement, both Yeung and Kambara have reported a capillary array approach with side-illumination, on-column fluorescence detection (21,22). These instruments provide continual illumination of the samples with one laser beam, providing a much higher duty cycle compared to a scanning instrument and requiring a much lower laser power than used in the line-focused or fiber-optic excited systems. The on-column detection scheme is useful for small numbers of capillaries but appears to be difficult to scale to larger arrays. Kambara showed another approach by applying post-column fluorescence detection in a sheath-flow cuvette (23,24). 
Sequencing length of 303 bases was achieved in 111 min.
Our research group reported the first capillary electrophoresis instrument based on post-column fluorescence detection in a sheath-flow cuvette (25). These detectors, borrowed from the optical chamber used in flow cytometry, provide very low background and very high sensitivity fluorescence detection, which allows the detection of individual fluorescent molecules migrating from a capillary electrophoresis column (26). We first reported the use of the sheath-flow cuvette for separation of single-base termination DNA sequencing fragments by capillary electrophoresis in 1990 (6). The system was expanded to four-color operation in 1991 (9). We reported a single-capillary electrophoresis instrument that operates at elevated temperatures with non-crosslinked polyacrylamide (15). The 0%C polymer has low viscosity and may be pumped from the capillary and replaced with fresh material after each run. Sequencing fragments over 640 bases were separated in 2 h at an electric field of 150 V cm⁻¹ and at a temperature of 60°C. This report was the first description of high temperature separation of DNA sequencing fragments in non-crosslinked polymer. The high temperature operation increased sequencing rate, decreased compression, and increased the sequencing read length compared to room temperature sequencing.
Figure 1. Four-spectral-channel laser-induced fluorescence detection. The sector wheel alternately transmits the two laser beams and the filter wheel synchronizes the transmission of fluorescence in four bands chosen to match the emission spectra of the four dyes.
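The duty-cycle and excitation-power arguments made earlier in this Introduction reduce to simple scaling. The sketch below is illustrative only: it assumes an ideal scanner that splits its time equally among capillaries with no dead time, and it reuses the 10 mW per capillary example from the text. The function names are hypothetical.

```python
def scanning_duty_cycle(n_capillaries: int) -> float:
    """Fraction of time each capillary is illuminated by an ideal scanner."""
    if n_capillaries < 1:
        raise ValueError("need at least one capillary")
    return 1.0 / n_capillaries


def total_laser_power_mw(n_capillaries: int, per_capillary_mw: float) -> float:
    """Total excitation power when every capillary needs its own beam."""
    return n_capillaries * per_capillary_mw


for n in (5, 16, 96):
    print(f"{n} capillaries: scanning duty cycle = {scanning_duty_cycle(n):.3f}")

# The example from the text: 10 mW per capillary, 100 capillaries.
print(f"{total_laser_power_mw(100, 10.0) / 1000.0:.1f} W of excitation light")
```

Under these assumptions a five-capillary scanner illuminates each capillary 20% of the time, whereas continuous illumination keeps the duty cycle near 100% regardless of array size.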
In this paper, we report a five-capillary version of the high-efficiency electrophoresis system (27). The instrument provides high sensitivity, which is important for the detection of the very small amount of sample loaded onto the capillary. The instrument also can operate at high temperature, which minimizes the formation of compressions, and produces faster separations and longer read-length (15). This instrument fills a niche between the single-capillary Applied Biosystems 310 and the larger-scale Applied Biosystems 377 slab-gel system and the Applied Biosystems 3700 capillary array instrument, and should be of interest to small-scale sequencing laboratories.
Materials and Methods
Instrumental design
The overall scheme for the instrument is shown in Figure 1 (27). Two lasers, a 1-mW helium-neon laser operating in the green at 543.5 nm and a 4-mW argon ion laser operating in the blue at 488 nm, were alternately chopped with a sector wheel to provide sequential excitation. The beams were recombined with a dichroic beam splitter and focused with a 1× microscope objective into the locally designed multiple-capillary sheath-flow cuvette.
Figure 2. The sheath-flow cuvette fluorescence detection chamber for an array of five capillaries. The chamber is tapered, being 50 µm wider at the top than at the bottom. A single laser beam is used to illuminate fluorescence from the five sample streams isolated by the sheath flow fluid.
The rectangular sheath-flow cuvette (Fig. 2) was constructed from high quality quartz by NSG Precision Cells. The cuvette had a 150 µm × 750 µm inner chamber, four 1-mm thick windows and a height of 2 cm.
The two narrow walls of the chamber were tapered so that their spacing was 50 µm wider at the top than at the bottom. The taper forced the capillaries to be squeezed together as they were inserted into the cuvette. Sheath fluid was pumped through the interstitial space between the capillaries. Either 10 mM borate (for free-zone electrophoresis) or 1× TBE (for gel electrophoresis) was used as the sheath fluid. This buffer drew the sample from the capillaries, creating a set of independent sample streams, one sample stream centered beneath each capillary. Fluorescence from each sample stream was excited simultaneously by the laser beam that was focused ∼100 µm below the ends of the capillaries.
A microscope objective (20× and 0.5 NA) collected fluorescence. A filter wheel was located between the collection optic and the detectors. This filter wheel was equipped with four 1-inch-diameter interference filters. For sequencing, the bandpass of the filters was centered at 540, 560, 580 and 610 nm with a 10-nm bandwidth. For short tandem repeat analysis, the bandpass was centered at 530, 545, 560 and 580 nm. The sector wheel, which alternately transmitted the blue and green laser beams, was synchronized with the filter wheel by use of stepper motors driven by a common controller. Excitation at 488 nm was synchronized with detection at the two shorter wavelength filters. Excitation at 543.5 nm was synchronized with detection at the two longer wavelengths. The filter wheel was equipped with sensors to signal the identity of the filter in the optical path.
It was not possible to image the fluorescence directly onto the photodetectors; their relatively large size was incompatible with the spacing of the fluorescence image generated by the collection optic. Instead, the fluorescence was imaged onto a set of five gradient refractive index (GRIN) lenses.
These GRIN lenses were 1.8-mm diameter, 0.25 pitch and arranged at 3-mm center spacing; there was one GRIN lens per fluorescent spot. Each GRIN lens coupled the fluorescence to a fiber optic, which was pig-tailed to an avalanche photodiode (APD). These photodiodes operated in the single-photon-counting mode and provided low dark count, high quantum efficiency photodetection. The output of the single-photon-counting modules consisted of a train of pulses. The pulse train was converted to a voltage with a frequency-to-voltage converter, which was monitored with an analog-to-digital board. A Macintosh Quadra 700 was used for data acquisition.
Capillary electrophoresis
A 30-kV power supply was used to drive electrophoresis. The samples or running buffers were encased in a Plexiglas box equipped with a safety interlock. The capillaries were held in a locally constructed heater based on Peltier devices and a proportional temperature controller, which held the temperature of the capillaries with an accuracy of ±0.5°C. The sheath flow was provided either by a precision syringe pump or by simply siphoning with the height difference of 5–2 cm between sheath flow reservoir and waste container. Sheath flow rate was typically 0.3 ml h⁻¹.
Reagents and solutions
Fluorescein was a high purity standard from Molecular Probes. Borax, boric acid and EDTA were analytical-reagent grade. Tris base and urea were ultra pure reagents. Acrylamide and N,N,N′,N′-tetramethylethylenediamine (TEMED) were electrophoresis-purity reagents. Ammonium persulphate was ultra pure electrophoresis grade. [γ-(methacryloxy)propyl]trimethoxysilane was reagent grade. A stock borate solution was prepared by dissolving 0.478 g of borax in 50 ml water. It was diluted to a final concentration of 10 mM borate (pH 9.2). A stock solution of ∼1 mM fluorescein was made by dissolving fluorescein in ethanol.
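As a quick consistency check on the borate stock described above, the following sketch estimates its concentration and the dilution needed to reach 10 mM. It assumes the borax was the decahydrate (Na2B4O7·10H2O, 381.37 g/mol) and that "borate" counts total boron; neither is stated in the text, so the result is indicative only.

```python
# Molar mass of borax decahydrate, Na2B4O7.10H2O (assumed hydrate form).
BORAX_DECAHYDRATE_G_PER_MOL = 381.37
BORON_PER_BORAX = 4  # each borax formula unit carries four boron atoms


def borate_molarity(mass_g: float, volume_ml: float) -> float:
    """Total borate (as boron) in mol/L for borax dissolved in volume_ml."""
    moles_borax = mass_g / BORAX_DECAHYDRATE_G_PER_MOL
    return BORON_PER_BORAX * moles_borax / (volume_ml / 1000.0)


stock = borate_molarity(0.478, 50.0)
dilution_factor = stock / 0.010  # fold dilution needed to reach 10 mM borate
print(f"stock = {stock * 1000:.1f} mM borate; dilute {dilution_factor:.1f}-fold")
```

On these assumptions the stock is close to 100 mM borate, so a tenfold dilution gives the 10 mM working buffer.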
A series of four concentrations of fluorescein were prepared from 2 × 10⁻¹¹ to 2 × 10⁻¹² M by diluting the stock fluorescein solution with 10 mM borate. A 1% [γ-(methacryloxy)propyl]trimethoxysilane solution was prepared by adding 10 µl of [γ-(methacryloxy)propyl]trimethoxysilane to 990 µl of 95% ethanol-5% water solution (pH 4.8). A stock 10× TBE (pH 8.3) solution was prepared by dissolving 108 g Tris, 55 g boric acid and 40 ml of 0.5 M EDTA in water to a final volume of 1 liter.
Capillary polymer preparation
We coated the capillaries using a method reported earlier (15). By using a water aspirator, the capillaries were first filled with 1% [γ-(methacryloxy)propyl]trimethoxysilane solution; after 20 min, the silane solution was replaced with degassed 5% acrylamide, 7 M urea, 0.07% (w/v) ammonium persulphate and 0.07% (v/v) TEMED solution prepared in a TBE buffer (1× TBE final). The polymerization reaction was allowed to proceed overnight. The 5%T polyacrylamide solution-filled capillary was subjected to a 30-min pre-run before sample injection.
The four-color sequencing reaction products were separated using a non-crosslinked dimethylacrylamide polymer solution. The polymer was prepared from 6% dimethylacrylamide, 7 M urea, 0.07% (w/v) ammonium persulphate and 0.07% (v/v) TEMED solution prepared in a TBE buffer (1× TBE final) under an argon atmosphere. The polymerized solution was loaded into a 10-ml syringe and stored at 4°C before use. The polymerized solution was injected into the capillary using a homemade manifold system. The separation was performed at 50°C, the capillary length was 60 cm and the electric field was 185 V cm⁻¹.
Sequencing sample preparation and injection
All sequencing templates were M13mp18. The four-color sample was a cycle-sequencing product (15). A Sequitherm Long-Read Cycle Sequencing kit was used for cycle sequencing with fluorescently labeled primers.
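The 10× TBE recipe and the separation conditions given above can be checked with a few lines of arithmetic. This is a sketch, assuming Tris base (121.14 g/mol) and boric acid (61.83 g/mol), and taking the 60 cm capillary length as the length over which the field is applied; the variable names are illustrative.

```python
TRIS_G_PER_MOL = 121.14        # Tris base (assumed reagent form)
BORIC_ACID_G_PER_MOL = 61.83


def molar(mass_g, g_per_mol, volume_l):
    """Concentration in mol/L of a solute dissolved to a final volume."""
    return mass_g / g_per_mol / volume_l


# 10x stock made up to 1 litre, as in the recipe above.
stock_10x = {
    "Tris": molar(108.0, TRIS_G_PER_MOL, 1.0),
    "borate": molar(55.0, BORIC_ACID_G_PER_MOL, 1.0),
    "EDTA": 0.040 * 0.5 / 1.0,  # 40 ml of 0.5 M EDTA diluted to 1 litre
}
one_x_mm = {name: conc / 10.0 * 1000.0 for name, conc in stock_10x.items()}

# Voltage implied by 185 V/cm across a 60 cm capillary.
separation_kv = 185.0 * 60.0 / 1000.0

for name, mm in one_x_mm.items():
    print(f"1x {name}: {mm:.1f} mM")
print(f"separation voltage: {separation_kv:.1f} kV")
```

With these assumptions 1× TBE works out to roughly 89 mM Tris, 89 mM borate and 2 mM EDTA, and the separation voltage to about 11 kV, well within the range of the 30-kV supply.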
Cycle sequencing reactions were carried out according to the recommended protocols of the manufacturer. Cycle sequencing was performed using 30 cycles on a PRT-100 Programmable Thermal Controller equipped with a hot bonnet without oil; each cycle consisted of 15 s at 95°C and 90 s at 70°C. Reaction products were pooled and immediately ethanol precipitated. The dried pellet was resuspended in 1.5 µl of formamide. The products were injected at 100 V cm⁻¹ for 40 s.
Short tandem repeat (STR) analysis
Human blood samples were collected and DNA extracted. Polymerase chain reactions (PCR) were performed using 25 ng genomic DNA. Each PCR contained dNTPs (2 mM each dATP, dCTP, dGTP and dTTP), 5× N buffer (50 mM KCl, 10 mM Tris-HCl pH 8.3, 170 µg ml⁻¹ bovine serum albumin, 0.05% Tween 20, 0.05% NP-40 and 1.5 mM MgCl2) and 0.5 U Taq polymerase. STR primers were ordered from Research Genetics or Integrated DNA Technologies Inc., and were fluorescently labeled with 6-FAM, TET or HEX. The total PCR volume was 15 µl. The mixture was first subjected to an initial denaturing step of 94°C for 1 min. PCR was performed for 30 cycles consisting of 94°C for 1 min, 56°C for 2 min and 72°C for 1 min. The mixture was then treated to a final elongation step of 72°C for 7 min. PCR products were purified with Microcon-30 columns to remove excess salts and primers according to the manufacturers' instructions. The final elution was in formamide. TAMRA-labeled Genescan-500 sizing ladder was added to the purified PCR products and samples were denatured at 94°C for 2 min immediately prior to injection.
The PCR products were separated by use of a denaturing 7% linear polyacrylamide solution in 1× TBE buffer with 6 M urea. The polymer was prepared as described above. The fused-silica capillaries were 43 cm long, 140 µm OD and 50 µm ID. Electrokinetic sample injection was 100 V cm⁻¹ for 30 s. The sheath flow buffer was 1× 45°C. Subsequent to the run, data were analyzed using Igor Pro and MatLab.
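Sizing against the internal ladder can be sketched as follows. The text does not say which interpolation the authors used, so piecewise-linear interpolation between flanking ladder peaks is an assumption here, and the ladder sizes and migration times are made-up illustrative values rather than Genescan-500 data.

```python
import numpy as np


def size_fragments(ladder_times, ladder_sizes, peak_times):
    """Interpolate fragment sizes (bases) from migration times.

    Uses piecewise-linear interpolation between flanking ladder peaks,
    which is an assumed sizing method, not one taken from the paper.
    """
    ladder_times = np.asarray(ladder_times, dtype=float)
    ladder_sizes = np.asarray(ladder_sizes, dtype=float)
    order = np.argsort(ladder_times)  # np.interp needs increasing x values
    return np.interp(peak_times, ladder_times[order], ladder_sizes[order])


# Hypothetical ladder peaks (minutes) in one capillary and their nominal sizes.
ladder_t = [10.0, 12.0, 14.5, 17.0, 20.0]
ladder_bp = [100, 150, 200, 250, 300]

alleles = size_fragments(ladder_t, ladder_bp, [13.25, 18.5])
print(alleles)
```

Because each capillary carries its own ladder, sizes computed this way can be compared between capillaries even when absolute migration times differ.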
Figure 3. (A) Image of the GRIN lens array. The image was a back-illumination of the optical system. (B) Superimposed image of the fluorescence spots and back-illuminated spots seen through an auxiliary microscope placed opposite the sheath-flow cuvette from the collecting optic. Capillaries were 50 µm ID, 150 µm OD and 37.0 cm long, filled with 10 mM borate, pH 9.2. Fluorescein concentration was 10⁻⁷ M. The argon ion laser power was 4.0 mW. Green spots are fluorescence from sample streams, while the brown spots are scattering from the capillary tips.
Data processing
A simple base-calling algorithm was used to analyze the data. The routine was written in MatLab and run on a Power Macintosh G3 and will be described in detail elsewhere. The routine has several components. The data were first convolved with a Gaussian filter and then each of the traces was baseline corrected. Next, a response matrix was constructed based on the relative fluorescence intensities in each spectral channel. The data were multiplied by the inverse of this spectral response matrix to convert from spectral-space to dye-space (28). A mobility shift routine was incorporated to accommodate differences in mobilities for fragments labeled with the different dyes. Local maxima were identified and sequence was called based on the maximum dye response.
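The steps listed above can be illustrated with a minimal sketch. This is not the authors' MatLab routine, which the text says will be described elsewhere; it is a simplified Python/NumPy version under stated assumptions: the baseline is taken as each channel's minimum, the mobility-shift correction is omitted, and the function names, the `threshold` parameter, the response matrix and the synthetic data are all hypothetical.

```python
import numpy as np


def gaussian_smooth(trace, sigma=2.0):
    """Convolve one trace with a normalized Gaussian kernel."""
    half = int(4 * sigma)
    x = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    return np.convolve(trace, kernel, mode="same")


def call_bases(raw, response, dyes="ACGT", sigma=2.0, threshold=0.1):
    """Call bases from four spectral channels.

    raw: array (4, n_points) of spectral-channel signals.
    response: (4, 4) matrix; column j is dye j's signal in each channel.
    """
    smooth = np.array([gaussian_smooth(ch, sigma) for ch in raw])
    # Crude baseline correction: subtract each channel's minimum.
    smooth -= smooth.min(axis=1, keepdims=True)
    # Spectral space -> dye space via the inverse response matrix.
    dye = np.linalg.inv(response) @ smooth
    strongest = dye.max(axis=0)   # largest dye response at each point
    winner = dye.argmax(axis=0)   # which dye gives that response
    calls = []
    for i in range(1, strongest.size - 1):
        is_peak = strongest[i] > strongest[i - 1] and strongest[i] >= strongest[i + 1]
        if is_peak and strongest[i] > threshold * strongest.max():
            calls.append(dyes[winner[i]])
    return "".join(calls)


# Synthetic demonstration: four well-separated peaks with spectral overlap.
t = np.arange(300)
true_dye = np.zeros((4, t.size))
for position, dye_index in [(60, 2), (120, 0), (180, 3), (240, 1)]:
    true_dye[dye_index] += np.exp(-0.5 * ((t - position) / 3.0) ** 2)
response = np.array([[1.0, 0.3, 0.1, 0.0],
                     [0.4, 1.0, 0.3, 0.1],
                     [0.1, 0.4, 1.0, 0.3],
                     [0.0, 0.1, 0.4, 1.0]])
called = call_bases(response @ true_dye, response)
print(called)
```

Real traces would also need the mobility-shift step and a more careful baseline estimate, and closely spaced peaks late in a run are where a simple local-maximum rule like this one is most likely to fail.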
Peak area was also calculated and used to identify and resolve multiplets late in the run.
Results and Discussion
Hydrodynamics
In our sheath-flow cuvette, an array of capillaries is snugly fit into a rectangular quartz flow chamber. A simple siphon pumps the fluid through the interstitial spaces between the capillaries and draws the analyte streams, one per capillary, into the open region below the capillaries. A single laser beam excites fluorescence from all sample streams simultaneously. The highly transparent sheath fluid and the vanishingly low concentration of the DNA sample produce negligible attenuation of the laser beam across the cuvette. Figure 3 presents a photograph of the sample streams. Each spot is ∼50 µm in diameter, which equals the inner diameter of the capillaries. The spots are uniformly spaced by ∼150 µm, which equals the outer diameter of the capillaries. The fluorescent spots are well separated in the photograph, generating negligible cross-talk.
The design of a successful multiple-capillary sheath-flow cuvette requires careful attention to hydrodynamic focusing. In particular, it is necessary to have uniform sheath flow between each capillary. An unbalanced flow will cause the sample streams to deflect towards the region of lower flow velocity. This deflection of the sample stream results in misalignment with the optical system and can result in the failure to record a signal from that capillary.
Uniform hydrodynamic flow is achieved if the capillaries are uniformly spaced within the cuvette. While it is possible to use micromachined cuvettes to hold the capillaries on uniform centers, our five-capillary instrument employs a somewhat simpler design that ensures uniform capillary spacing. The capillaries are inserted into a rectangular sheath flow cuvette. The narrow dimension of the cuvette is matched to the outer diameter of the capillary (Fig. 2).
The inner walls of the cuvette are slightly tapered so that the top of the cuvette is slightly wider than the sum of the capillaries' diameters while the bottom of the cuvette is slightly narrower than that distance. As a result, the capillaries are squeezed together as they are inserted into the cuvette; since the capillaries are in contact, their spacing is very uniform, as are the sheath flow and the sample streams.
Optics
Microscope objectives are efficient collection optics for fluorescence detection in capillary electrophoresis (29). In our system, a 20×, 0.5 numerical aperture microscope objective, which provides a collection efficiency of 6.7%, collects fluorescence from the sheath flow cuvette. With 50-µm ID and 150-µm OD capillaries, the objective produces an image that consists of 1-mm diameter fluorescence spots at 3-mm spacing.
A set of single-photon counting avalanche photodiodes is used for fluorescence detection. These very rugged devices provide extremely high quantum efficiency (>50%) and reasonable dynamic range. The APDs are housed in rather bulky containers, which contain low-noise amplifiers, Peltier coolers and single-photon detecting electronics. The APDs are too bulky to be used directly to detect the fluorescence images from the capillary array; we cannot pack them closely enough to simultaneously monitor fluorescence from each capillary.
A set of five GRIN lenses, coupled to fiber optics, is used to transmit fluorescence from the image plane of the microscope objective to the APDs. The core of the optical fiber is only 100 µm in diameter, which is much smaller than the 1-mm fluorescence spot. We use a set of GRIN lenses, 1.8 mm diameter and 0.25 pitch, at the image plane at 3-mm center spacing to couple fluorescence into the optical fibers (Fig. 2). These inexpensive, compact optical elements efficiently couple fluorescence into the optical fibers. The use of optical fibers and GRIN lenses has proven to be valuable in alignment of the system.
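The 6.7% collection efficiency and the image dimensions quoted above are consistent with a simple geometric estimate. The sketch below assumes isotropic emission and uses the solid-angle fraction of a cone whose half-angle is set by the numerical aperture in a medium of refractive index 1.0; the text does not state how the authors computed their figure, so this is a plausibility check rather than their method.

```python
import math


def collection_efficiency(numerical_aperture: float, n_medium: float = 1.0) -> float:
    """Fraction of isotropic emission that falls inside the objective's cone."""
    half_angle = math.asin(numerical_aperture / n_medium)
    return (1.0 - math.cos(half_angle)) / 2.0


def image_size_mm(object_um: float, magnification: float) -> float:
    """Size in the image plane (mm) of a feature given in micrometres."""
    return object_um * magnification / 1000.0


print(f"collection efficiency: {collection_efficiency(0.5):.3f}")
print(f"spot diameter: {image_size_mm(50, 20)} mm")
print(f"spot spacing:  {image_size_mm(150, 20)} mm")
```

At 20× magnification the 50-µm sample streams on 150-µm centers map to 1-mm spots on 3-mm centers, which is why the GRIN lenses are placed at 3-mm center spacing.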
The optical fibers can be disconnected from the APDs. The detection end is illuminated with a lamp, creating back-illumination of the optical system. When viewed through an alignment microscope placed on the opposite side of the sheath flow cuvette from the collection optics, the illuminated optical fibers transmit light through the GRIN lenses to the 20× microscope objective into the sheath-flow cuvette (Fig. 3). Alignment is achieved by flowing dilute fluorescent dye through the capillaries; the relative position of the cuvette and the laser beam are adjusted until the fluorescent spots from the dye and the back-illuminated spots from the GRIN lenses are superimposed. The optical fibers are then reconnected to the APDs and a final tweaking of the optical system is performed to maximize the signal from the APDs.
Detection limits
The limit of detection was evaluated in free-zone-electrophoresis mode. The 37.0-cm capillaries were filled with a 10 mM borate buffer. Electrophoresis was performed with an electric field of 300 V cm⁻¹ across the capillaries. A 1.1-nl plug of fluorescein was injected electrokinetically (1.0 kV, 5 s). Figure 4 presents an electropherogram of a 2 × 10⁻¹² M solution of fluorescein (2.2 zeptomol or 1300 molecules injected). The five traces were recorded simultaneously and presented as photon counts per 200 ms window. Migration of the fluorescein solution from the capillary generated one peak per capillary. The difference in migration time reflects the differences in electro-osmotic flow between the capillaries; the capillary walls were not coated for this experiment. The average peak area corresponds to 12 000 photons above the background signal level; each molecule generated an average of nine detected photons. Blank injections were performed by dipping the capillaries into the dye solution without application of injection potential; no peaks were observed from these blanks.
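The injected amount and photon yield quoted above follow directly from the plug volume and concentration. The short calculation below reproduces them; the only value not taken from the text is the Avogadro constant, and the function name is illustrative.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_injected(volume_nl: float, molarity: float) -> float:
    """Number of molecules in a plug of the given volume and concentration."""
    return volume_nl * 1e-9 * molarity * AVOGADRO


n = molecules_injected(1.1, 2e-12)       # 1.1 nl of 2 x 10^-12 M fluorescein
zeptomol = 1.1e-9 * 2e-12 / 1e-21        # moles injected, in zeptomoles
photons_per_molecule = 12_000 / n        # 12 000 detected photons per peak

print(round(n), round(zeptomol, 1), round(photons_per_molecule, 1))
```

The result, about 1300 molecules (2.2 zeptomol) and roughly nine detected photons per molecule, matches the figures given in the text.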
Detection limits (3σ) were 130 ± 30 molecules (2 × 10⁻¹³ M) injected onto the capillaries (30). These detection limits reflect the good light collection efficiency of the optical system, the low background signal generated in the sheath-flow cuvette, and the high quantum efficiency and low noise of the APDs. Figure 4 also demonstrates the other important feature of the design: the fluorescence detection is free of cross-talk; a peak in one capillary did not generate a signal in an adjacent capillary. The sheath flow not only lowered the background in fluorescence detection, but also provided excellent physical isolation for the separation channels even when the capillaries were in contact.
Figure 4. Injection of 1300 fluorescein molecules. The capillaries were 50 µm ID, 150 µm OD and 37.0 cm long, filled with 10 mM borate, pH 9.2. Fluorescein, 2 × 10⁻¹² M, was injected at 1 kV for 5 s. The electrophoresis was conducted at an electric field strength of 300 V cm⁻¹. Argon ion laser power was 4.0 mW at 488 nm. Each data point was a 0.2 s count. The data were subjected to a binomial smoothing before plotting.
DNA sequencing at 50°C
By incorporating two-laser-line excitation and four-spectral-channel detection (9) into the five-capillary system, we turned the five-capillary instrument into a modest throughput, high-performance DNA sequencer.
Figure 5 shows a typical sequencing separation performed at 50°C at a moderate electric field strength of 185 V cm⁻¹ in a capillary filled with 6% non-crosslinked polydimethylacrylamide. The sequencing accuracy was 98.8% for 650 bases when using a simple base-calling algorithm written using MatLab. The software performance was limited by difficulties in handling multiplets late in the run; all errors were associated with an inaccurate estimate of these multiplets. Clearly, improved software will result in improved sequencing accuracy.
Figure 5. DNA sequencing run of an M13mp18 sample. The separation was performed in 6% non-crosslinked polydimethylacrylamide at 50°C at an electric field strength of 185 V cm⁻¹. The base-calls were performed using an algorithm written in MatLab. The called sequence is given above each peak. Errors are noted beneath the called sequence.
Microsatellite analysis
Markers on chromosome 7 were chosen to test the application of microsatellite methodology to capillary electrophoresis. The chromosome is ∼184 cM (sex-averaged) in genetic length; five microsatellite markers (D7S479, D7S500, D7S501, D7S523 and D7S554) were used to test the technology. These markers cover almost half of the long arm of the chromosome. Generally, markers spaced at ∼20 cM intervals allow the detection of linkage to a distance of 10 cM on either side of any putative disease-causing gene. Figure 6 presents five loci for a child and its parents, along with the signal from a commercial size standard. Although the D7S479 locus generated significant stutter bands, the patterns are clearly resolved and identification of the STR pattern is trivial.
The microsatellite allele sizes for each family member determined by capillary electrophoresis were compared to those produced using traditional radioactive labeling and slab gel separation. The allele sizes generated by the two techniques corresponded (data not shown), indicating that DNA fragment sizes can be accurately determined and directly compared with the use of an internal standard. Separation of DNA fragments by capillary electrophoresis was rapid, with DNA of 500 bp length detected within 200 min with single base pair resolution.
Conclusions
We describe a five-capillary DNA sequencer based on a sheath-flow cuvette. The instrument uses avalanche photodiodes and a sheath-flow cuvette to produce extremely high detection sensitivity, which is important when analyzing small amounts of fluorescently labeled DNA. The instrument can operate at 50°C, which is valuable in reducing compressions in DNA sequencing. The instrument can also be used for genetic mapping, where the use of fluorescently labeled size markers facilitates comparison of genotyping patterns between individuals.
The instrument currently relies on manual refilling of the capillaries with sequencing matrix between runs. Roughly 2.5 h were required to refill the capillaries and analyze the next sample. This turnaround time would be improved with an automated capillary refilling system, such as that found on commercial instruments.
The instrument fills a niche between single-capillary DNA sequencers and 96-capillary DNA sequencers. We have also constructed 16- and 32-capillary versions of this instrument, which will be described elsewhere (H.J.Crabtree, S.J.Bay, D.Lewis, L.Coulson, G.Fitzpatrick, D.J.Harrison, S.Delinger, J.Z.Zhang, and N.J.Dovichi, paper submitted).
Figure 6. STR analysis of five loci from a child and its parents.
The blue peaks are size-markers that were used to align the patterns. The data were treated with a color-inversion matrix to correct for spectral overlap between the dyes. Data from the migration period containing each locus are plotted in the five panels.
Acknowledgements
This work was supported by the Canadian Genetic Diseases Network, the Canadian Bacterial Diseases Network, the Natural Sciences and Engineering Research Council of Canada (NSERC) and Sciex. K.V. acknowledges a graduate fellowship from the Alberta Heritage Foundation for Medical Research.
References
1. Hunkapiller T., Kaiser R.J., Koop B.F., Hood L. Science, 1991, vol. 254, p. 67.
2. Dovichi N.J., Landers E. CRC Handbook of Capillary Electrophoresis, 1990, Boca Raton, FL: CRC Press, pp. 369–387.
3. Swerdlow H., Gestland R. Nucleic Acids Res., 1990, vol. 18, pp. 1415–1419.
4. Drossman H., Luckey J.A., Kostichka A.J., D'Cunha J., Smith L.M. Anal. Chem., 1990, vol. 62, pp. 900–903.
5. Cohen A.S., Najarian D.R., Karger B.L. J. Chromatogr., 1990, vol. 516, pp. 49–60.
6. Swerdlow H., Wu S., Harke H., Dovichi N.J. J. Chromatogr., 1990, vol. 516, pp. 61–67.
7. Chen D., Swerdlow H.P., Harke H.R., Zhang J., Dovichi N.J. J. Chromatogr., 1991, vol. 559, pp. 237–246.
8. Karger A.E., Harris J.M., Gestland R.F. Nucleic Acids Res., 1991, vol. 19, pp. 4955–4962.
9. Swerdlow H., Zhang J., Chen D., Harke H.R., Grey R., Wu S., Dovichi N.J. Anal. Chem., 1991, vol. 63, pp. 2835–2841.
10. Rocheleau M.J., Grey R.J., Chen D.Y., Harke H.R., Dovichi N.J. Electrophoresis, 1992, vol. 13, pp. 484–486.
11. Chen D., Harke H.R., Dovichi N.J. Nucleic Acids Res., 1992, vol. 20, pp. 4873–4880.
12. Harke H.R., Bay S., Zhang J.Z., Rocheleau M.J., Dovichi N.J. J. Chromatogr., 1992, vol. 608, pp. 143–150.
13. Rocheleau M.J., Dovichi N.J. J. Microcol. Sep., 1992, vol. 4, pp. 449–453.
14. Luckey J.A., Smith L.M. Anal. Chem., 1993, vol. 65, pp. 2841–2850.
15. Zhang J., Fang Y., Hou J.Y., Ren H., Jiang R., Roos P., Dovichi N.J. Anal. Chem., 1995, vol. 67, pp. 4589–4593.
16. Zagursky R.J., McCormick R.M. Biotechniques, 1990, vol. 9, pp. 74–79.
17. Mathies R.A., Huang X.C. Nature, 1992, vol. 359, pp. 167–169.
18. Taylor J.A., Yeung E.S. Anal. Chem., 1993, vol. 65, pp. 956–960.
19. Ueno K., Yeung E.S. Anal. Chem., 1994, vol. 66, pp. 1424–1431.
20. Quesada M.A., Zhang S. Electrophoresis, 1996, vol. 17, pp. 1841–1851.
21. Lu X., Yeung E.S. Appl. Spectrosc., 1995, vol. 49, pp. 605–609.
22. Anazawa T., Takahashi S., Kambara H. Anal. Chem., 1996, vol. 68, pp. 2699–2704.
23. Kambara H., Takahashi S. Nature, 1993, vol. 361, pp. 565–566.
24. Takahashi S., Murakami K., Anazawa T., Kambara H. Anal. Chem., 1994, vol. 66, pp. 1021–1026.
25. Cheng Y.F., Dovichi N.J. Science, 1988, vol. 242, pp. 562–564.
26. Chen D.Y., Dovichi N.J. Anal. Chem., 1996, vol. 68, pp. 690–696.
27. Zhang J.Z. Ph.D. Thesis, 1994, University of Alberta.
28. Giddings M.C., Brumley R., Haker M., Smith L.M. Nucleic Acids Res., 1993, vol. 21, pp. 4530–4540.
29. Wu S., Dovichi N.J. J. Chromatogr., 1989, vol. 48, pp. 141–155.
30. Knoll J.E. J. Chromatogr. Sci., 1985, vol. 1985, pp. 422–425.
© 1999 Oxford University Press
High-throughput plasmid DNA purification for 3 cents per sample
Marra, Marco A.; Kucaba, Tamara A.; Hillier, LaDeana W.; Waterston, Robert H.
doi: 10.1093/nar/27.24.e37; pmid: 10572189
Abstract
To accommodate the increasingly rapid rates of DNA sequencing we have developed and implemented an inexpensive, expeditious method for the purification of double-stranded plasmid DNA clones. The robust nature, high throughput, low degree of technical difficulty and extremely low cost have made it the plasmid DNA preparation method of choice in both our expressed sequence tag (EST) and genome sequencing projects. Here we report the details of the method and describe its application in the generation of more than 700 000 ESTs at a rate exceeding 16 000 per week.
Introduction
While cost-effective and robust methods for purification of cloned DNA from bacteria impact the success of any DNA sequencing effort, they are crucial for high-throughput, large-scale projects where many thousands of DNA purifications and subsequent sequencing reactions are performed daily. These activities require easily implemented methods that are capable of the required throughput, have a low level of technical difficulty, yield DNA in reproducible amounts, and are applicable to different bacterial host/vector combinations. As part of ongoing efforts to improve the double-stranded plasmid DNA purification process at our Genome Sequencing Center (1), we explored different commercial and in-house methods, evaluating each with respect to the factors listed above. These investigations led to the development of the DNA purification method described here, first implemented in a production mode in October of 1997. Since that time, a team of four technicians has used it to prepare more than 700 000 DNA samples, with current throughput exceeding 16 000 preparations each week. The method, suitable for fueling high-throughput sequencing projects, is described here in detail.
Materials and Methods
96-well microwave protocol
Two studies in the literature reported the isolation of plasmid DNA from bacterial cells by treatment with lysozyme followed by a brief exposure to microwave radiation.
One study (2) used 1 ml aliquots of bacterial cultures grown to saturation in 50 ml culture volumes. Bacterial cell pellets, collected in individual microfuge tubes, were resuspended in a lysozyme-containing solution. Exposure to microwave radiation was performed, and the resulting DNA purified by isopropanol precipitation. The authors noted that DNA prepared by their protocol was of quality suitable for restriction digests and other unspecified enzymatic modifications, but the suitability of the DNA for sequencing was not reported. Another study (3) reported the use of microwave radiation in the isolation of plasmid DNA from large volumes (250 ml) of bacterial cell cultures. Again, the suitability of this DNA for sequencing was not discussed. Both the volumes and formats reported in these studies were unsuited to large-scale sequencing; hence, we sought to adapt microwave-mediated bacterial cell lysis to 96-well format, and assess the performance of the resulting DNAs in sequencing reactions.
Culturing plasmid clones and DNA purification
Culture conditions were similar to those described (4). Briefly, a 12 channel pipettor was used to transfer, from a 384-well plate of glycerol stocks, 5 µl to each well of a 96-well Beckman block (Beckman part no. 140504) containing 1 ml of Terrific Broth (Difco) with the appropriate antibiotic. Blocks were covered with polystyrene lids, placed in custom holders in a Labsystems floor shaker and incubated at 37°C for 24 h with agitation at 295 r.p.m. Following growth, blocks were removed, cell pellets collected by an 8 min centrifugation at 2700 r.p.m. (1400 g) in a Jouan GR-422 centrifuge, and excess culture media decanted. Draining of residual media was achieved by inverting the block over paper toweling with intermittent rapping, after which blocks were sealed with foil tape and stored at −80°C. To purify DNA from the bacterial cell pellets, blocks were first thawed by incubation on the laboratory bench or in a 37°C water bath.
Sterile de-ionized water (25 µl) was added to each well with either BIOHIT Proline 50–1200 µl or Multidrop 831 (Labsystems) pipettors and bacterial cell pellets were resuspended by vigorous vortexing using a Multi-Tube Vortexer (VWR 58816-115). Resuspension of the bacterial cell pellet was confirmed by subjecting each block to a brief vortex using a Vortex-Genie 2 model G-560 (VWR) and inspecting the wells to ensure they were free of ‘clumps’. A lysis solution (70 µl), prepared fresh by mixing 200 ml STET-TWEEN20 [10 mM Tris-HCl pH 8.0, 1 mM EDTA pH 8.0, 100 mM NaCl, 5% Molecular Biology Grade TWEEN20 (Sigma; P9416); stored at room temperature for at least 3 months with no decrease in DNA sequence quality] with 48 mg (4.8 ml of a 10 mg/ml solution) RNase A (Sigma; R6513) and 100 mg (2 ml of a 50 mg/ml solution) of lysozyme (Sigma; L7651), was then added to each well with a Multidrop pipettor. Immediately after addition of lysis solution, mixing was achieved by subjecting the block to a brief vortex, with care taken to avoid the introduction of bubbles. Multiple blocks were handled serially; lysis solution was added, the block vortexed, and then the next block processed. This was followed by a 5 min incubation at room temperature, after which two blocks were placed side-by-side in a microwave oven (General Electric Model J-E693TWH 002) with a peak power output rating of 0.95 kW. The blocks were subjected to a 30 s irradiation at the maximum power setting and, to ensure uniform exposure of all wells to microwaves, removed from the microwave, rotated 180 degrees, replaced in the oven, and subjected to an additional 30 s of irradiation. The Multidrop pipettor was then used to add 350–500 µl sterile de-ionized water to each well. The volumes added varied depending upon the vector types and host cells of the clones being processed, with the optimal amount determined empirically by sequence analysis (see below).
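The lysis solution recipe above implies particular final enzyme concentrations and a particular number of wells per batch. The following is a minimal sketch of that arithmetic in Python. The function name and the returned keys are illustrative only, and the well count ignores pipettor dead volume, which the text does not report.

```python
def lysis_mix_summary(
    stet_ml=200.0,                 # STET-TWEEN20 volume from the text
    rnase_stock_ml=4.8,            # 10 mg/ml RNase A stock
    rnase_stock_mg_per_ml=10.0,
    lysozyme_stock_ml=2.0,         # 50 mg/ml lysozyme stock
    lysozyme_stock_mg_per_ml=50.0,
    per_well_ul=70.0,              # lysis solution dispensed per well
    wells_per_block=96,
):
    """Summarize one batch of lysis solution (illustrative helper, not from the paper)."""
    total_ml = stet_ml + rnase_stock_ml + lysozyme_stock_ml
    rnase_mg = rnase_stock_ml * rnase_stock_mg_per_ml
    lysozyme_mg = lysozyme_stock_ml * lysozyme_stock_mg_per_ml
    # Whole wells that can be filled, assuming no dead volume (an assumption).
    wells = int(total_ml * 1000.0 // per_well_ul)
    return {
        "total_ml": total_ml,
        "rnase_mg_per_ml": rnase_mg / total_ml,
        "lysozyme_mg_per_ml": lysozyme_mg / total_ml,
        "wells": wells,
        "full_blocks": wells // wells_per_block,
    }


if __name__ == "__main__":
    for key, value in lysis_mix_summary().items():
        print(f"{key}: {value:.3f}" if isinstance(value, float) else f"{key}: {value}")
```

Under these assumptions a single 206.8 ml batch works out to roughly 0.23 mg/ml RNase A and 0.48 mg/ml lysozyme, enough for about 30 full 96-well blocks.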
Blocks were subjected to a brief vortex to mix the well contents, sealed with foil tape, and placed in an ice water bath for up to 15 min. Bacterial cell debris was collected in the bottom of the wells by a 30 min centrifugation of the block in a Jouan GR-422 at 4000 r.p.m. (∼3000 g), after which a Robbins Scientific 96-well Hydra pipettor was used to transfer 105 µl of DNA-containing supernatant from the Beckman blocks to non-tissue culture treated 96-well polystyrene Costar microtiter plates (VWR; 62408-189). This DNA-containing solution was used directly, without further purification, for DNA sequencing and restriction analysis.
DNA sequencing
DNA was sequenced (5) using either BigDye terminator chemistry (Perkin-Elmer/Applied Biosystems) or DYEnamic energy transfer dye primers (Amersham). Typically, ‘master mixes’ of sequencing reaction components were dispensed into Robbins cycleplates using a Robbins Hydra, which was also used subsequently to add the DNA. For BigDye terminator sequencing, a reaction cocktail consisting of 2.5 µl of terminator pre-mix, 1.0 µl (3 pM) oligonucleotide primer, and 3.5 µl of sterile deionized water was assembled. To this was added 1–2 µl of microwave lysate containing on average between ∼130 and 260 ng of DNA. A MicroAmp Full Plate Cover (Perkin-Elmer; N801-0550) was placed onto the cycleplates and the reaction mixtures collected in the bottom of the wells by centrifugation. Cycle sequencing was performed in MJ Research PTC-200 or −225 thermocyclers using 35 cycles of 95°C for 15 s, 45°C for 5 s, 60°C for 2 min, followed by incubation at 4°C. After cycling, cycleplates were centrifuged to collect the reactions in the bottom of the wells and a mixture of 1 µl of 3 M sodium acetate, pH 5.2 and 100 µl of 100% ethanol (typically added as 101 µl of a mixture of the two components) was added to each well.
After mixing by repeated pipetting with a Robbins Hydra, DNA precipitates were collected by centrifuging the cycleplate for 30 min at 4000 r.p.m. Ethanol was decanted and the reaction pellet washed with 200 µl 80% ethanol. After a second centrifugation for 15 min at 4000 r.p.m., ethanol was again decanted and the samples dried for 10 min in a SpeedVac. Dried samples were placed in light-tight containers and stored at −20°C until electrophoresis on ABI 373 or 377 sequencers. For each dye primer sequencing reaction a reaction cocktail consisting of 0.4 µl Thermo Sequenase reaction buffer, 1.6 µl Thermo Sequenase Nucleix, 1 µl DYEnamic ET primer mix (at 0.1 µM for A and C reactions and 0.2 µM for G and T reactions) and 1 µl (1.5 U) Thermo Sequenase enzyme was prepared. Reaction cocktails were aliquoted into cycleplates and stored at −20°C until required. When needed, a set of four cycleplates, representing a set of 96 reactions, was thawed and centrifuged and a Robbins Hydra used to add 2 µl of DNA to each well. Cycleplates were again centrifuged, placed in thermocyclers and subjected to 15 cycles of 95°C for 4 s, 55°C for 10 s, 70°C for 1 min, followed by 15 cycles of 95°C for 4 s, 70°C for 1 min. Precipitation of completed sequencing reactions was performed with a Robbins Hydra using a pooling scheme which involved addition of 100 µl of 100% ethanol and 3 µl of 5 M ammonium acetate to all ‘A’ reactions, followed by serial transfer of the ethanol/salt/DNA mixture first to the cycleplate containing the ‘C’ reactions, then the ‘G’ reactions and finally the ‘T’ reactions. Pooled reactions in the ‘T’ tray were placed on ice for 15 min and then centrifuged at 4000 r.p.m. for 30 min. Following centrifugation, trays were inverted to decant ethanol and blotted onto paper towels to facilitate draining. Reaction pellets were washed by addition of 200 µl of 70% ethanol and then centrifuged for 15 min at 4000 r.p.m. to collect the reaction pellet.
Ethanol was decanted, the tray rapped gently on paper towels to remove excess ethanol, and the reaction pellets dried in a SpeedVac with the rotor removed. Reactions were stored in light-tight containers until required. Immediately prior to electrophoresis, reactions were resuspended in 2 µl of formamide/blue dextran dye and denatured by heating at 95°C for 5 min. Reactions were subjected to electrophoresis on ABI 373 or 377 sequencers with 36 cm well-to-read glass plates. For the 373, sequence data were collected on 8.3 M urea, 4.75% SeaQuate gels (Sooner Scientific, SQL 4750) run at 3000 V, 25–30 W, 40 mA for 10 h. For the 377, sequence data were collected on 6 M urea, 5% Gene Page Plus (Amresco J722-1L-UWA; degassed before use). Both Seq Run 36A-2400 (3000 V, 60 mA, 200 W; 3 h electrophoresis) and Seq Run 36A-1200 (1750 V, 50 mA, 150 W; 8 h electrophoresis) run modules were used for dye primer sequencing reactions; for BigDye terminator reactions, both Seq Run 36E-2400 and Seq Run 36E-1200 run modules (conditions described above) were used. Sequencing gel images were processed using Gelminder (J.Mullikan, M.Holman and A.Chinwalla, unpublished) which incorporates the signal processor PLAN (B.Ewing and P.Green, unpublished).
DNA sequence analysis
We conducted qualitative evaluations of sequence electropherograms (‘traces’) generated from microwave-prepared DNA using the UNIX tool VTRACE (B.Ewing, unpublished), which facilitates manual inspection of sequence trace data. Initial sequencing results revealed traces which were ‘top heavy’: sequence read lengths were shorter than expected, with a fragment size distribution weighted towards short sequencing extension products. These trace profiles often indicate excessive DNA in the sequencing reaction. To determine the optimal dilution of the microwave lysate, a series of trial-and-error experiments was performed in which lysates were diluted with water and re-sequenced.
Traces were then manually inspected using VTRACE and analyzed by EST_OTTO (described below). Optimal dilutions were plasmid- and host cell-dependent; for pT7T3-based plasmids (6) and pZERO plasmids (Invitrogen) propagated in DH10B cells (Life Technologies), best results were obtained by addition of 500 µl of water to the DNA preparation following microwave lysis. For pBluescript-based plasmids propagated in SOLR cells (Stratagene), equivalent sequencing results were obtained when 360 µl of water were added. After satisfactory and reproducible sequencing results had been obtained, we measured the concentration of pZERO microwave-prepared plasmids using PicoGreen stain (Molecular Probes) and a Fluorimager (Molecular Dynamics). The average concentration of 192 diluted samples, from two 96-well microtitre plates, was calculated to be 135.7 ng/µl with a standard deviation of 9.8 ng/µl (DNA concentrations ranged from 10.9 ng/µl to 448 ng/µl). A VTRACE image of a DNA sample prepared by the microwave method and sequenced with BigDye terminators is shown in Figure 1. Peaks are well resolved, and background signal does not confound the base-calling. For this sequence trace, the high quality length cut-off was 400 bases, the maximum allowed by the software. This cut-off is conservative; useful data extends well beyond, with resolvable peaks in the run of T residues near position 540. These results typified those achieved with DNA prepared using the microwave protocol and our standard sequence electrophoresis conditions for ESTs. The performance of the microwave lysates under specific conditions designed to achieve longer read lengths was not assessed.
Figure 1. VTRACE output for a DNA sample prepared with the 96-well microwave method and sequenced as described with BigDye terminators. Electrophoresis was performed on an ABD 377 using 4XA run module parameters.
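As a rough cross-check, the measured mean lysate concentration can be related to the template amount quoted for the BigDye reactions (1–2 µl of lysate, ∼130–260 ng). The sketch below simply multiplies volume by concentration. The function names are illustrative, and applying the pZERO mean to other plasmid and host combinations is an assumption.

```python
MEAN_CONC_NG_PER_UL = 135.7      # PicoGreen mean for diluted pZERO lysates (from the text)
COCKTAIL_UL = 2.5 + 1.0 + 3.5    # terminator pre-mix + primer + water (from the text)


def template_ng(lysate_ul, conc_ng_per_ul=MEAN_CONC_NG_PER_UL):
    """Approximate mass of DNA added with a given lysate volume."""
    return lysate_ul * conc_ng_per_ul


def reaction_volume_ul(lysate_ul):
    """Total BigDye reaction volume after adding lysate to the cocktail."""
    return COCKTAIL_UL + lysate_ul


if __name__ == "__main__":
    for vol in (1.0, 2.0):
        print(f"{vol} µl lysate: ~{template_ng(vol):.0f} ng in {reaction_volume_ul(vol)} µl")
```

With the mean concentration this gives about 136 ng and 271 ng for 1 µl and 2 µl, in line with the approximate range stated in the protocol. Individual wells varied widely (10.9 to 448 ng/µl), so per-well amounts would differ accordingly.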
Quantitative evaluations of sequence quality for large numbers of traces were produced by the computer program EST_OTTO. These reports included information on the fraction of high quality (or ‘successful’) sequences and the average high quality read lengths attained for a set of sequences. Both overall sequence trace quality and the position of the last high-quality base in a sequence trace can be determined. A measure of the noise-to-signal ratio (the ratio of the height of the highest non-called base to the height of the called base at that position) was used to determine overall trace quality. At least ten 16-base windows (each window overlapping by 8 bases) of the trace were required to display fewer than six ‘problem’ regions, where a problem region was defined as a peak having a noise-to-signal ratio of >0.27. To set the high quality cut-off for trace data a ‘sideband’ measure was used. This was defined as the ratio of the height of a peak's shoulder at a position in a trace halfway between adjacent peaks to the maximum height of the peak. The sideband measure was used starting from the 5′ end of the trace and moving towards the 3′ end. Traces were scanned with a window of 16 bases, moved in 8 base increments, until four consecutive ‘problem’ regions were found. Here, a problem region was defined as one in which a single base in a window of 16 bases had a sideband ratio of 0.95 or more. After this position was determined, a noise-to-signal measure was applied to trim the trace back towards the 5′ end. The position of the last high-quality base in the trace was then set at the point where the noise-to-signal ratio was <0.25 for every base in two consecutive windows.
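The two-stage cut-off procedure described above can be sketched in a few lines of Python. This is not the EST_OTTO implementation, which is unpublished. It assumes the per-base sideband and noise-to-signal ratios have already been computed, reads ‘a single base’ as ‘at least one base’, and places the stop position at the start of the first of the four problem windows. All function names are hypothetical.

```python
from typing import Sequence

WINDOW = 16  # bases per window (from the text)
STEP = 8     # window increment in bases (from the text)


def window_starts(n_bases: int) -> list:
    """Start indices of 16-base windows advanced in 8-base increments."""
    last = max(n_bases - WINDOW, 0)
    return list(range(0, last + 1, STEP))


def sideband_stop(sideband: Sequence[float], threshold: float = 0.95, run: int = 4) -> int:
    """Scan 5' to 3'; return the start of the first run of `run` consecutive
    problem windows (a window with any sideband ratio >= threshold).
    Returns the trace length if no such run exists."""
    consecutive = 0
    for start in window_starts(len(sideband)):
        if any(v >= threshold for v in sideband[start:start + WINDOW]):
            consecutive += 1
            if consecutive == run:
                return start - (run - 1) * STEP
        else:
            consecutive = 0
    return len(sideband)


def high_quality_cutoff(sideband: Sequence[float], noise: Sequence[float],
                        sb_threshold: float = 0.95, ns_threshold: float = 0.25) -> int:
    """Number of high-quality bases: trim back from the sideband stop until two
    consecutive windows (24 bases, given the 8-base overlap) are all below the
    noise-to-signal threshold. Returns 0 if no such stretch is found."""
    stop = min(sideband_stop(sideband, sb_threshold), len(noise))
    span = WINDOW + STEP
    for start in reversed(window_starts(len(noise))):
        end = start + span
        if end <= stop and all(v < ns_threshold for v in noise[start:end]):
            return end
    return 0


if __name__ == "__main__":
    # Synthetic trace: sideband degrades after base 120, noise after base 100.
    sb = [0.1] * 120 + [1.0] * 80
    ns = [0.1] * 100 + [0.5] * 100
    print(high_quality_cutoff(sb, ns))
```

Window-aligned trimming keeps the sketch close to the wording of the text. A per-base walk-back would give a slightly later cut-off and may be closer to what the original program did.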
These values were determined empirically: the derivation of trace quality values that described high quality sequence data resulted from efforts to identify values which best approximated a dataset of manually selected cutoffs.
Table 1. EST_OTTO analysis of ESTs generated from microwave-prepared DNA
a Libraries are human cDNA or SAGE libraries except as indicated. The vector and host bacterial cell line are shown for each library.
b DNA templates were sequenced with energy transfer (E-T) dye primers or BigDye terminators. Gessler Wilms' tumor templates were used in the stability assay (text). Numbers in bold are sequencing results obtained in the test for DNA stability, after passage of the DNA preparations through five freeze-thaw cycles.
c See text for definition of ‘high quality’.
Examples of some of the output EST_OTTO reported for sequences generated from microwave lysates are shown in Table 1. From these data it is evident that the microwave method can be used to prepare sequence-quality DNA from cDNA libraries employing different vectors and propagated in different bacterial host strains. This flexibility is relevant to EST efforts, which sequence libraries obtained from diverse sources.
The data in Table 1 also show that microwave lysates can be sequenced with different chemistries; we have successfully used BigDye terminator (Perkin-Elmer/Applied Biosystems) and Energy-Transfer dye primer (Amersham) technologies. Finally, Table 1 shows that the DNA prepared using this method is of sufficient stability for our purposes. Shown is an assay for stability in which 48 samples from a nephroblastoma library, constructed by M.Gessler, were subjected to five consecutive freeze-thaw cycles over a temperature range from −20°C to room temperature. Subsequent resequencing showed no degradation of sequence quality. Week-long incubations at 4°C could be performed without affecting the integrity of the DNA (not shown) but DNA was found to degrade after a 4 day incubation at room temperature. A long-term, larger-scale assessment of sequence quality has been performed with the tool GASP (7; Fig. 2). An evaluation of 40 329 sequences, generated by BigDye terminator sequencing of microwave lysates, was performed. Here, we have used a function provided by GASP to plot the fraction of sequences (ordinate) against the number of high quality bases per read (abscissa), where base quality has been determined using the program Phred (8,9). From this analysis, 7% of the sequences have 50 or fewer high quality bases (Phred values of at least 20). The average high quality sequence length, excluding those reads with 75 or fewer high quality bases, was 427 bases, while 70.52% of the sequences had at least 400 bases of high quality data.
Restriction digestion of microwave lysates
In addition to sequencing, another focus of several projects has been the estimation of cDNA insert size by restriction digestion of DNA samples followed by analysis on agarose gels (4).
A restriction digestion mix for cDNA libraries, constructed as described (6), consisted of 7.75 µl deionized water, 1.0 µl Buffer B (Boehringer Mannheim) and 0.125 µl (5 U) each of HindIII and EcoRI (Boehringer Mannheim). cDNA libraries constructed using different cloning strategies required different choices of enzymes to liberate the cDNA insert from the vector. Typically, sufficient ‘master mix’ for multiple 96-well digests was assembled on ice. Nine microliters of the mix were then dispensed into each well of a Robbins Scientific cycleplate using a Robbins Scientific Hydra 96 channel pipetting device, which was also used to add ∼200 ng of microwave-prepared DNA to each well. Cycleplates were sealed with foil tape, centrifuged briefly to collect the reaction components in the bottom of the wells, and incubated for 1 h at 37°C in a water bath. Two microliters of 6× loading dye (10) were added to each well and the cycleplate subjected to a second centrifugation. Ten microliters of this mixture were then loaded with a 12-channel pipettor (Oxford) into each well of a 96-well 2% agarose gel cast in 1× TAE (10). Electrophoresis in 1× TAE buffer was for ∼3 h at 80 V, or until complete as judged by monitoring the migration of the dyes in the gel. DNA was visualized by incubating gels in 0.5 µg/ml ethidium bromide and then scanning the gel using a Molecular Dynamics Fluorimager (Model SI or Model 595). 16-bit images were collected at normal sensitivity with a pixel size of 200 microns and a PMT voltage of 850. Shown in Figure 3 is an image, generated on a Molecular Dynamics Fluorimager, of an ethidium bromide stained 2% agarose gel containing 96 restriction-digested microwave lysates. These results are of a quality comparable to that achieved with DNA prepared by other methods (not shown). We noted that digested lysates often exhibited evidence of high molecular weight material, perhaps bacterial genomic DNA, migrating above the vector-specific restriction fragment. 
The amount of this material correlated with the extent to which vortexing was conducted (after addition of water and prior to centrifugation as described above). More vigorous vortexing produced more of this material, while in cases where vortexing was brief and gentle this material was sometimes absent. Whatever its source, this material did not affect sequencing results or confound restriction analysis.
Figure 2. GASP quality analysis of EST sequences prepared using the 96-well microwave method. 40 329 sequences were analyzed with GASP using a Phred cutoff score of 20. A base having at least a score of 20 is classified here as a high quality base. MxBin, the x-axis coordinate where the distribution reaches a maximum. % 50, the percentage of reads having 50 or fewer high quality bases. % 400, the percentage of reads having 400 or more high quality bases. AvLen, the average length of the sequences analyzed, excluding sequences with 75 or fewer high quality bases.
Discussion
The microwave protocol described here has become the preferred method for purification of double-stranded DNA in our sequencing projects. The DNA is very easily produced, of adequate stability and generally performs very well in the sequencing and restriction analysis protocols used in our laboratory.
The protocol, a favorite of our technical staff, is markedly less tedious than others due primarily to the reduced number of required manipulations. For example, growth of bacterial cultures and subsequent DNA isolation are performed in the same 96-well block, and no further purification of the DNA, by precipitation or other means, is necessary. The 96-well blocks can be re-used indefinitely provided they are cleaned between uses. Importantly for our applications, the DNA can be manipulated with the Robbins Hydra pipettor without fear of clogging or otherwise affecting the instrument. Further, the lysis solution is easily made and is stable at room temperature for a minimum of 3 months, allowing litre-quantity batches to be made and stored. And, unlike commercially available DNA preparation kits, knowledge of the reagent composition should facilitate troubleshooting when necessary. Finally, the preparation is remarkably inexpensive. We calculate that reagents cost 1.6 cents, with plastic ware adding another 1.4 cents for a total estimated cost of 3 cents per sample. This compares very favorably with the cost of commercial preparation methods, which can exceed 1 US dollar per sample. We first implemented the method in October 1997 and, through March 1999, have used it to purify more than 700 000 EST templates. During this period, application of the microwave method described here has resulted in savings of ∼$700 000 over the cost of commercially obtained kit reagents, a sum which we have applied to the generation of additional ESTs.
Figure 3. 2% agarose gel stained with ethidium bromide showing restriction digests of 96 microwave lysates. Samples were processed as described (Materials and Methods). M, marker DNAs (Boehringer Mannheim Marker VI). The sizes of the individual marker fragments are given. O, origins of the gel. V, vector-specific restriction fragments.
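The cost figures quoted in the Discussion can be reproduced with a short calculation. The sketch below assumes a flat commercial kit price of exactly 1 US dollar per sample, the lower bound implied by "can exceed 1 US dollar". The function names are illustrative.

```python
from decimal import Decimal


def per_sample_cost_cents(reagents=Decimal("1.6"), plastic_ware=Decimal("1.4")):
    """Per-sample cost of the microwave method, in cents (figures from the text)."""
    return reagents + plastic_ware


def savings_usd(n_samples, kit_cost_usd=Decimal("1.00")):
    """Estimated savings versus a kit at `kit_cost_usd` per sample (assumed price)."""
    method_usd = per_sample_cost_cents() / Decimal(100)
    return n_samples * (kit_cost_usd - method_usd)


if __name__ == "__main__":
    print(per_sample_cost_cents())   # cents per sample
    print(savings_usd(700_000))      # US dollars saved over 700 000 templates
```

At exactly 1 dollar per kit preparation the saving over 700 000 templates comes to 679 000 dollars, consistent with the reported figure of roughly $700 000 given that kit prices could exceed 1 dollar.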
Acknowledgements
We are indebted to Elaine Mardis, Eric Green, Sharon Gorski and Stephanie Chissoe for comments on the manuscript and to Elizabeth Boatright for assistance in quantification of microwave lysates. For excellent technical support in all aspects of EST generation at Washington University Genome Sequencing Center we are grateful to John Martin, Cathy Beck, Karen Underwood, Michele Steptoe, Brenda Theising, Todd Wylie, Missy Allen, Yvette Bowers, Barry Person, Tim Swaller, Marilyn Gibbons, Deana Pape, Njata Harvey, Rebecca Schurk, Erica Ritter, Sophie Kohn, Tanya Shin, Marco Cardenas, Tammy Yount, Angie Blistain, Yolanda Jackson, Rhonda McCann, Ann Chamberlain, Julie Chapell, Richard Morales, Louise Bowles and Steve Geisel, and all the staff at Washington University Genome Sequencing Center. Thanks also to Christa Prange (LLNL) and Greg Lennon (GeneLogic and LLNL) for making available cDNA clones for sequence analysis.
References
1 Marra M.A., Hillier L., Waterston R.H., Trends Genet., 1998, vol. 14 (pg. 4-7)
2 Hultner M.L., Cleaver J.E., Biotechniques, 1994, vol. 16 (pg. 990-994)
3 Wang B., Merva M., Williams W.V., Weiner D.B., Biotechniques, 1995, vol. 18 (pg. 554-555)
4 Hillier L., Lennon G., Becker M., Bonaldo M.F., Chiapelli B., Chissoe S., Dietrich N., DuBuque T., Favello A., Gish W., et al., Genome Res., 1996, vol. 6 (pg. 807-828)
5 Sanger F., Coulson A.R., Barrell B.G., Smith A.J.H., Roe B.A., Proc. Natl Acad. Sci. USA, 1977, vol. 74 (pg. 5463-5467)
6 Bonaldo M.F., Lennon G., Soares M.B., Genome Res., 1996, vol. 6 (pg. 791-806)
7 Wendl M.C., Dear S., Hodgson D., Hillier L., Genome Res., 1998, vol. 9 (pg. 975-984)
8 Ewing B., Hillier L., Wendl M., Green P., Genome Res., 1998, vol. 8 (pg. 175-185)
9 Ewing B., Green P., Genome Res., 1998, vol. 8 (pg. 186-194)
10 Sambrook J., Fritsch E.F., Maniatis T., Molecular Cloning: A Laboratory Manual, 1989, 2nd Edn, Cold Spring Harbor, NY, Cold Spring Harbor Laboratory Press
© 1999 Oxford University Press
Ligase-based detection of mononucleotide repeat sequences
Zirvi, Monib; Barany, Francis; Nakayama, Takamori; Newman, Gregg; McCaffrey, Timothy; Paty, Philip
doi: 10.1093/nar/27.24.e40; pmid: 10572192
Abstract
Up to 15% of all colorectal cancers are considered to be replication error positive (RER+) and contain mutations at hundreds of thousands of microsatellite repeat sequences. Recently, a number of intragenic mononucleotide repeat sequences have been demonstrated to be targets for inactivating genes in RER+ colorectal tumors. In this study, thermostable DNA ligases were tested for the ability to detect alterations in microsatellite sequences in colon tumor samples. Ligation profiles on mononucleotide repeat sequences were determined for four related thermostable DNA ligases, Thermus thermophilus (Tth) ligase, Thermus sp. AK16D ligase, Aquifex aeolicus ligase and the K294R mutant of the Tth ligase. While the limit of detection for point mutations was one mutation in 1000 wild-type sequences, the ability to detect a single base deletion in a 10 base mononucleotide repeat was one mutation in 100 wild-type sequences. Furthermore, the misligation error increased exponentially as the length of the mononucleotide repeat increased, and was 10% of the correct signal for a 19 base mononucleotide repeat. A fluorescent ligase-based assay [polymerase chain reaction/ligase detection reaction (PCR/LDR)] correlated with results obtained using a radioactive assay to detect instability within the TGF-β Type II receptor gene. PCR/LDR was also used to detect the APC I1307K mononucleotide repeat allele which has a carrier frequency of 6.1% in Ashkenazi Jewish individuals. In a blind study, 30 samples that had been typed for the presence of the APC I1307K allele were tested. The PCR/LDR results correlated with those obtained using sequencing and allele-specific oligonucleotide hybridization for 16 samples carrying the mutation and 13 wild-type samples. Ligation assays that characterize mononucleotide repeats can be used to rapidly detect somatic mutations in tumors, and to screen for individuals who have a hereditary predisposition to develop colon cancer.
Introduction Approximately 15% of the 150 000 new cases of colorectal cancer diagnosed each year demonstrate microsatellite instability (1,2). This instability is now understood to reflect the presence of the replication error positive (RER+) phenotype in these tumors (3–8). The RER+ phenotype results from an underlying molecular defect in the mismatch repair system which leads to a failure to correct frameshift errors in microsatellite repeats (9). Mutations in mismatch repair enzymes, hMSH2, hMLH1, hPMS1 and hPMS2, leading to RER+ tumors, have been found in both hereditary cancers and spontaneous cancers (10–15). Over 95% of patients with hereditary non-polyposis colorectal cancer (HNPCC) syndrome have a predisposition for the development of RER+ colorectal cancer due to inherited mutations in one of the mismatch repair enzymes (16,17). Slippage errors during DNA replication can produce alterations in a microsatellite of defined length. In general, errors in multiple microsatellite repeats are diagnostic of the RER+ phenotype (1–3). Furthermore, specific replication error mutations, such as the deletion of an adenosine in an A10 repeat, have been reported in the TGF-β Type II receptor gene, providing a direct link between the RER+ phenotype and a specific pathway of tumorigenesis (18). Analyses of RER+ colorectal tumor samples indicate that >80% of such samples carry inactivating frameshift mutations in this poly(A) repeat, which represents an alternate pathway in the etiology of colorectal tumors (19,20). The adenomatous polyposis coli (APC) gene represents another important cellular regulator in colonic epithelial cells. Disruption of APC function is an early event in colorectal tumorigenesis and is present in a majority of colon tumors (21–23). 
Germline mutations in the APC gene lead to an inherited colon cancer syndrome known as familial adenomatous polyposis syndrome (FAP), in which affected individuals present with thousands of polyps at a young age (24,25). Vogelstein and colleagues reported a polymorphism (T→A transversion at APC nucleotide 3920, codon 1307) found in 6.1% of Ashkenazi Jews and ∼28% of Ashkenazim with a family history of colon cancer (26). They have postulated that rather than altering the function of the encoded protein, this polymorphism creates a small hypermutable region of the gene, indirectly causing cancer predisposition through acquired somatic mutations. This transversion occurs within an interrupted stretch of seven poly(A) nucleosides. The polymorphism converts this region of the gene into an eight nucleotide long poly(A) microsatellite repeat. The somatic mutations observed in tumors from patients carrying this allele primarily include slippage within the microsatellite repeat, as well as some point mutations in the region (26). The majority of mutation detection methods rely on the polymerase chain reaction (PCR) to initially amplify a target DNA region followed by subsequent analysis using either restriction endonuclease digestion, dideoxy sequencing, allele-specific hybridization, or polymerase-based detection (27). Microsatellites, in particular, are analyzed by PCR amplification followed by electrophoretic separation on gels (28,29). These methods are not well suited for the detection of variations occurring within mononucleotide repeats and closely clustered mutations, especially in tumor samples. Our laboratory has developed a combined polymerase chain reaction/ligase detection reaction (PCR/LDR) method for discriminating single-base mutations (30–33). In LDR, thermostable DNA ligase specifically links two adjacent oligonucleotides when annealed at 65°C to a complementary target only when the nucleotides are perfectly base-paired at the junction. 
A single-base mismatch at the junction prevents ligation/amplification and is thus distinguished from a perfect match. While frameshift mutations occurring in mononucleotide repeats cannot be detected using allele-specific PCR, they may be amenable to detection using PCR/LDR. The feasibility of using PCR/LDR to detect the presence of a single nucleotide deletion in an A10 repeat sequence was tested using synthetic substrates. Subsequently, the assay was extended to detect alterations in mononucleotide repeat sequences in clinical samples. Finally, PCR/LDR was used to detect the APCI1307K polymorphism from DNA isolated from both blood and archival paraffin-embedded colorectal tumor samples.
Materials and Methods
Oligonucleotide synthesis and purification Oligonucleotides were synthesized on an ABI 394 DNA Synthesizer (PE Biosystems Inc., Foster City, CA). Oligonucleotides used in LDR were purified by electrophoresis on 10% polyacrylamide/7 M urea gels. Bands were visualized by UV shadowing, excised from the gel, and eluted overnight at 64°C in TNE buffer (100 mM Tris-HCl pH 8.0, 500 mM NaCl, 5 mM EDTA). The oligonucleotides were recovered from the eluate using C18 Sep-Pak cartridges (Waters Corp., Milford, MA) following the manufacturer's instructions. Oligonucleotides were resuspended to ∼1 mM in 100 µl TE (10 mM Tris pH 8.0, 1 mM EDTA). For LDR, gel purified stock solutions were diluted to 100 µM (100 pmol/µl). The upstream, or discriminating, oligonucleotides were labeled with fluorescent reporter groups at their 5′ termini during synthesis using Fam or Tet phosphoramidites (PE Biosystems Inc., Foster City, CA). The downstream, or common, oligonucleotides were phosphorylated at the 5′ end using a chemical phosphorylation reagent, and blocked at the 3′ end using a 3′-spacer C3 CPG (Glen Research, Sterling, VA).
DNA extraction from cell lines Cell lines (SW620, HCT116 and DLD1) were grown in RPMI culture media with 10% fetal bovine serum. 
Harvested cells (∼1 × 10^7) were resuspended in DNA extraction buffer (10 mM Tris-HCl pH 7.5, 150 mM NaCl, 2 mM EDTA) containing 0.5% SDS and 200 µg/ml proteinase K and incubated at 37°C for 4 h. Next, a 30% vol of 5 M NaCl was added and the mixture was centrifuged. DNA was precipitated from the supernatant with 3 vol of ethanol (EtOH), washed with 70% EtOH and resuspended in TE buffer (10 mM Tris-HCl pH 7.5, 2 mM EDTA pH 8.0).
DNA extraction from paraffin sections DNA was prepared from 55 paraffin-embedded archival colon tumors and 36 paraffin-embedded pancreatic tumors. These samples represent a series of primary cancers removed by surgical resection at Memorial Sloan-Kettering Cancer Center. Tissue sections (10 sections, 10 µm thick) were cut from paraffin blocks and excess paraffin was removed. Samples were deparaffinized via sequential extraction with xylene, 100% EtOH and acetone, and dried under vacuum. The pellet was incubated overnight with proteinase K (200 µg/ml in 50 mM Tris-HCl pH 8.5, 1 mM EDTA and 0.5% Tween-20) at 55°C. After heating at 100°C for 10 min, debris was removed by centrifugation and the supernatant was stored at 4°C. Greater purity of DNA was achieved by phenol-chloroform extraction and ethanol precipitation, or by using the QIAamp Tissue Kit (Qiagen, Chatsworth, CA) when needed.
PCR conditions PCR amplifications were carried out in a volume of 50 µl in 10 mM Tris-HCl buffer pH 8.3 containing 10 mM KCl, 4.0 mM MgCl2, 250 µM dNTPs, 1 µM forward and reverse primers (50 pmol of each primer), and between 50 and 100 ng of genomic DNA extracted from paraffin blocks or from cell lines. After a 10 min denaturation step, 1.5 U of Amplitaq Gold DNA polymerase (Perkin Elmer, Norwalk, CT) was added under hot start conditions, and amplification was achieved by thermally cycling for 35 or 40 cycles of 94°C for 30 s, 60°C for 1 min and 72°C for 1 min, followed by a final extension at 72°C for 3 min. 
Four microliters of the PCR product were analyzed on a 2% agarose gel to verify presence of amplification product of the expected size. For DNA isolated from paraffin sections, Pfu polymerase (Stratagene, La Jolla, CA) was used for PCR amplification following the manufacturer's instructions. The sequences of PCR primers used are listed in Table 1.
LDR conditions LDR reactions were carried out in a 20 µl mixture containing 20 mM Tris-HCl pH 7.6, 10 mM MgCl2, 100 mM KCl, 10 mM DTT, 1 mM NAD+, 25 nM (500 fmol) of the detecting primers and mixtures of PCR products from cell lines or tumor samples. The reaction mixture was heated for 1.5 min at 94°C prior to adding 25 fmol of the wild-type or mutant Thermus thermophilus (Tth) DNA ligase. Ligases were overproduced and purified as described previously (34). Tth DNA ligase is available from New England Biolabs. LDR reactions were thermally cycled for 20 cycles of 15 s at 94°C and 2 min at 65°C. Reactions were stopped by adding 0.5 µl of 0.5 mM EDTA. Aliquots of 2.5 µl of the reaction products were mixed with 2.5 µl of loading buffer (83% formamide, 8.3 mM EDTA and 0.17% Blue Dextran). The mixture was supplemented with 0.5 µl Rox-1000, or TAMRA 350 molecular weight marker, denatured at 94°C for 2 min, chilled rapidly on ice prior to loading on an 8 M urea-10% polyacrylamide gel, and electrophoresed on an ABI 373 DNA sequencer at 1400 V for 2 h. Fluorescent ligation products were analyzed and quantified using the ABI Gene Scan 672 software. The amount of product is calculated from a calibration curve (one fmol = 600 peak area units). The sequences of LDR primers used are listed in Table 1. 
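The amounts quoted in this section follow from simple unit conversions, which the minimal Python sketch below makes explicit. It is an illustration rather than part of the published protocol: the function names `amount_fmol` and `peak_area_to_fmol` are hypothetical, and a linear calibration through the origin is assumed. The only values taken from the text are the 25 nM detecting primers in a 20 µl LDR, the 1 µM PCR primers in a 50 µl reaction, and the calibration factor of 600 peak area units per fmol.

```python
# Unit conversions behind the quantities quoted in Materials and Methods.
# Function names are illustrative; only the numeric constants come from the text.

PEAK_AREA_UNITS_PER_FMOL = 600.0  # stated calibration: one fmol = 600 peak area units


def amount_fmol(conc_nM: float, volume_ul: float) -> float:
    """Amount of oligonucleotide in fmol, using 1 nM x 1 µl = 1 fmol."""
    return conc_nM * volume_ul


def peak_area_to_fmol(peak_area: float) -> float:
    """Convert a peak area to fmol of ligation product.

    Assumes the calibration is linear and passes through the origin.
    """
    return peak_area / PEAK_AREA_UNITS_PER_FMOL


if __name__ == "__main__":
    # 25 nM detecting primer in a 20 µl LDR
    print(amount_fmol(25, 20), "fmol")
    # 1 µM (1000 nM) PCR primer in a 50 µl reaction, reported in pmol
    print(amount_fmol(1000, 50) / 1000, "pmol")
    # a hypothetical peak of 6000 area units
    print(peak_area_to_fmol(6000), "fmol")
```

Under these assumptions the first two conversions agree with the figures given in the text (500 fmol of each detecting primer and 50 pmol of each PCR primer).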
Table 1 PCR and LDR primers used to analyze mutations in intragenic mononucleotide repeats in the TGF-β Type II receptor and APC genes
Sequencing PCR products were sequenced using fluorescent dideoxy nucleotides and Taq FS polymerase (Perkin Elmer Cetus, Norwalk, CT). Following an initial denaturation at 94°C for 2 min, the sequencing reactions were carried out using 25 cycles of 94°C for 10 s, 50°C for 5 s, 60°C for 3 min. The products were electrophoresed as described above and analyzed using ABI Genescan sequence analysis software.
Restriction analysis PCR primers used to convert the APC1307 region to an MboII site or an AseI site were 5′-CTG CTA ATA CCC TGC AAA TAG CAG AA G-3′ and 5′-AGA TTC TGC TAA TAC CCT GCA AAT AGC ATT AA-3′, respectively. The reverse primer used for PCR in both cases was 5′-TGC TGT GAC ACT GCT GGA ACT TC-3′. Aliquots of PCR reactions were digested using either MboII or AseI (New England Biolabs, Beverly, MA) in the appropriate buffers. The digested products were electrophoresed on a 3% agarose gel, and stained with ethidium bromide.
Results and Discussion
Detection of mononucleotide repeat polymorphisms using thermostable ligase A preliminary assay was devised to determine whether a ligase-based assay could detect slippage mutations in a mononucleotide microsatellite repeat sequence. The region of the TGF-β Type II receptor containing a mononucleotide repeat sequence was synthesized and used to test the specificity of Tth ligase in discriminating a mononucleotide repeat of A10 from A9 and A8 (Fig. 1, top). NAD+-dependent thermostable ligases cloned from Aquifex aeolicus, T. thermophilus and Thermus AK16D were compared using this ligation assay. 
All of these thermostable DNA ligases demonstrated high fidelity in an LDR assay for frameshift mutations (results for Tth ligase are shown in Fig. 1, bottom). In order to test whether the position of the ligation junction along a repetitive sequence would affect ligation fidelity, ligation primers which positioned the ligation junction at different points along the mononucleotide repeat sequence were synthesized. Templates containing either nine or 10 mononucleotide repeats generated ligation products only in the presence of oligonucleotides that perfectly matched the number of repeated nucleotides. The site of the ligation junction along the nucleotide repeat did not influence the ability of the oligonucleotides to ligate. Thus, for short mononucleotide repeats, the ligation fidelity was determined to be independent of position of the ligation junction along the mononucleotide repeat sequence (Fig. 2). A study of steady-state kinetics for discrimination of a G:T mismatch demonstrated that a K294R mutant Tth ligase, Tsp AK16D ligase and Aquifex ligase demonstrated greater ligation fidelity compared to the wild-type Tth DNA ligase (35,36). A comparison of these four ligases on a mononucleotide target sequence yielded similar ligation profiles for each enzyme. These enzymes could detect the presence of a single-base frameshift mutation in up to a 100-fold excess of wild-type DNA sequence (Fig. 3). Calibration curves were constructed for the Tth DNA ligase enzyme to compare fidelity for point mutations versus frameshift mutations in a repetitive sequence (Fig. 4). LDR could detect one point mutation in 1000 copies of wild-type sequence and was quantitative over a three-log range (33). However, for frameshift mutations in a repetitive region, LDR was quantitative for less than a two-log range and could detect one mutation in 100 copies of wild-type sequence. 
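The all-or-none behaviour reported here for short repeats can be pictured with a toy model. The Python sketch below is an idealisation, not the authors' analysis: it treats ligation as occurring only when the discriminating and common primers abut perfectly on the target, ignores misligation entirely, and writes the primers in the same sense as the target for simplicity. The flanking sequences `UP` and `DOWN` and all function names are made up for illustration. The default split of the repeat, with four adenosines on the common primer, appears to follow the primer naming in Figure 1 (Com A4 paired with LP3′-A4, -A5 and -A6).

```python
# Toy model of repeat-length typing by LDR (idealised: no misligation).
# UP and DOWN are hypothetical flanking sequences, not taken from the paper.

UP = "CTGGTC"    # hypothetical sequence upstream of the repeat (does not end in A)
DOWN = "GCCTGT"  # hypothetical sequence downstream of the repeat (does not start with A)


def make_target(n_repeat: int) -> str:
    """Target region containing a mononucleotide repeat of n_repeat adenosines."""
    return UP + "A" * n_repeat + DOWN


def ligates(target: str, discriminating: str, common: str) -> bool:
    """Model ligation as requiring the two primers to abut with a perfect match."""
    return (discriminating + common) in target


def call_repeat_lengths(target: str, candidates=(8, 9, 10), common_a: int = 4) -> list[int]:
    """Return the candidate repeat lengths whose primer pair would ligate.

    common_a is the number of adenosines carried by the common primer;
    the discriminating primer carries the remainder of the repeat.
    """
    common = "A" * common_a + DOWN
    calls = []
    for length in candidates:
        discriminating = UP + "A" * (length - common_a)
        if ligates(target, discriminating, common):
            calls.append(length)
    return calls


if __name__ == "__main__":
    for n in (8, 9, 10):
        print(n, call_repeat_lengths(make_target(n)))
```

In this idealised model the call does not depend on `common_a`, which mirrors the observation that the position of the ligation junction did not influence ligation for short repeats. Real ligases show a finite misligation rate, which the model deliberately omits.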
On the other hand, the Pyrococcus and T4 ligases could not detect single-base frameshift mutations within mononucleotide repeats due to the high level of background misligation on wild-type sequences (data not shown). This difference was primarily a result of the higher rate of misligation for the frameshift mutations, as compared to the misligation rate for point mutations.
Figure 1 (Top) Synthetic substrate assay designed to test LDR on mononucleotide repeat sequences. Primers designed to detect A10, A9 and A8 templates. Primer Com A4 is phosphorylated on its 5′ terminus and has a Fam-label on its 3′ terminus. The primers LP3′-A6, LP3′-A5 and LP3′-A4 have repeat sequences at their 3′ termini and varying length tails at their 5′ termini. Primer 5′Tet-Com-3T5.rev has a Tet-label on its 5′ terminus and five thymidines on its 3′ terminus. The primers LP5′-T3.rev, LP5′-T4.rev and LP5′-T5.rev are phosphorylated on their 5′ termini and have varying length tails on their 3′ termini. The ligase reaction was carried out using 500 fmol of each primer and 50 fmol of template DNA in a 20 µl reaction with 25 fmol of Tth ligase per reaction. After ligation, the primers were separated and analyzed on an ABI 373 automated DNA sequencer. (Bottom) The primers designed for the poly(A) target were tested using the LDR on a template containing an A9 repeat. The ligation reaction could determine the length of the template in either a uniplex (side lanes) or multiplex format (center two lanes). There was only minimal misligation in the last lane.
Figure 2 LDR results for varying the position of the ligation junction through a repeat region. The ligation reaction was tested using common primers that contained from one to six adenosines at their 3′ termini (lower set of numbers above gel) which were paired with discriminating primers that contained the remaining adenosines required to form an A10 or A9 repeat (upper set of numbers above gel). The results of ligation are shown for both A10 and A9 templates.
Figure 3 Ligation profiles of the wild-type Tth, K294R Tth, Thermus AK16D and Aquifex DNA ligases for different concentrations of target DNA template. Various concentrations of A9 template (from 0 to 200 fmol) were diluted into 1000 fmol of A10 template. Ligase detection reactions were carried out as described in Materials and Methods. The graph depicts the amount of LDR product formed for each enzyme.
Figure 4 Calibration curves for LDR for point mutations and frameshift mutations using Tth DNA ligase. The ‘C’ point represents the amount of noise generated from misligation of the LDR primers on the excess wild-type template present in the reaction. The ligase detection reaction is quantitative over three orders of magnitude for point mutations (A), but less than two orders of magnitude for frameshift mutations in repetitive regions (B). In both cases, the ligase detection reactions were carried out using a dilution of mutant template in 1000 fmol of wild-type template. The detection limit is one mutant in 1000 wild-type sequences for point mutations and one mutant in 100 wild-type sequences for slippage mutations in repetitive sequences.
As the length of mononucleotide repeats is increased, a sequence will be reached for which the Tth DNA ligase would tolerate single-base insertions or deletions. To determine this length, templates containing longer poly(A) stretches were synthesized, gel-purified and tested. In an LDR assay, the Tth DNA ligase was able to distinguish an A16 template from an A15 template. There was only a slight increase in the amount of product generated from misligations for the incorrect template, ranging up to 4% of the correct product signal. For LDR on templates up to 19 mononucleotide repeats, the amount of signal generated from incorrect hybridization of the LDR primers to the synthetic templates increased to ∼10% of the correct product signal.
PCR amplification of mononucleotide repeat sequences The greatest source of error for analysis of mononucleotide repeat sequences, however, is the error generated during PCR amplification of microsatellite repeats. Polymerases that lack exonucleolytic activity typically generate non-templated 3′-terminal nucleotides that interfere with proper sizing of PCR amplicons. In addition, polymerases that possess exonuclease activity remove 3′-terminal bases from PCR amplicons and therefore produce a mixed population of PCR products (37). In either case, determination of mononucleotide repeat length based solely on the size of PCR products is complicated by these factors. In contrast, neither of these modifications of PCR products affects the ability to use LDR to determine the size of a mononucleotide repeat in a PCR product. 
Accumulation of errors during PCR amplification usually compounds the error rate of the DNA polymerase utilized (38). Therefore, even a moderate increase in replication fidelity of a polymerase can produce significant improvements in the fidelity of PCR amplification. A variety of PCR conditions were tested for genomic DNA samples in order to minimize the amount of slippage in the repeat regions during PCR amplification. The enzymes tested included Taq and AmpliTaq Gold polymerase, Vent (exo+) polymerase, Vent (exo−) polymerase, UlTma polymerase and Pfu polymerase. Each of the polymerases was tested in its native buffers with 2 mM Mg2+ and a dNTP concentration of 250 µM. Of the polymerases tested, Taq demonstrated the least fidelity, generating 50% slippage error when amplifying a control plasmid containing the TGF-β Type II receptor A10 repeat sequence. The polymerase with the greatest fidelity in amplification of this A10 region was Pfu polymerase, with an error band that was <20% of the control band after 30 rounds of PCR amplification. However, for amplification of polyguanosine or polycytosine mononucleotide repeats (i.e., the C8 region of the BAX gene), AmpliTaq polymerase did not generate slippage artifacts (data not shown).
Detection of mononucleotide repeat mutations in TGF-β Type II receptor using PCR/LDR The HCT116 cell line has been reported in the literature to be RER+ and demonstrates slippage in the TGF-β Type II receptor (18,39,40). Using optimized amplification conditions, genomic DNA extracted from the HCT116 cell line and human umbilical cord blood were typed to verify the presence and absence of mononucleotide slippage mutations. Subsequently, 12 carotid endarterectomy specimens (40) were typed using PCR/LDR and the results were compared to those obtained by PCR amplification followed by restriction digestion of the PCR amplicon to generate DNA fragments that could be sized precisely. 
The results were concordant using both methods for all of the samples tested and there was quantitative correlation between the results in each case (Fig. 5).
Detection of the APCI1307K mononucleotide repeat polymorphism in clinical samples using PCR/LDR The APCI1307K allele creates an A8 stretch within the coding region of the APC gene. LDR primers were designed for the detection of the APCI1307K allele and were initially tested on synthetic substrates. Blinded genomic DNA samples, initially typed using allele-specific oligonucleotides (ASOs), were obtained from Dr Ostrer's laboratory at the Human Genetics Program at New York University Medical College. All positive samples and some negative samples had been sequenced to confirm the initial result in these samples (26). Thirty samples were tested using PCR/LDR in an initial study and the same 30 samples were resorted and retested in a follow-up study. Representative PCR/LDR results for detection of the APCI1307K allele are shown (Fig. 6, top). In both studies, the results obtained using the PCR/LDR method matched those obtained by the hybridization and sequencing except for one sample. This sample was repeatedly typed as positive by PCR/LDR and had been typed as negative by ASO hybridization. The sample was sequenced using fluorescent dideoxy sequencing and the presence of the APCI1307K allele was confirmed. This sample therefore represents a false negative missed by the hybridization method and correctly typed using the PCR/LDR primers.
Figure 5 Comparison of PCR/LDR with PCR/RE for analysis of TGF-β Type II receptor slippage mutations. Genomic DNA from carotid endarterectomy specimens was analyzed for mutations in the TGF-β Type II receptor by primary PCR amplification followed by either restriction digestion to remove PCR sizing artifacts or LDR. The relative percentages of mutant compared to wild-type signals were quantitated using a phosphoimager for PCR/RE and using an ABI 373 for PCR/LDR. (Top) Graph depicting the correlation between the two methods. (Bottom) Gel results from PCR/RE and PCR/LDR analysis of 12 carotid endarterectomy specimens. The samples included two mutant samples and one sample which was a primary culture of a mutant sample that was partially transfected with a plasmid containing wild-type TGF-β Type II receptor.
In a subsequent study to determine whether PCR/LDR could be used to screen for the APCI1307K allele in archival specimens, genomic DNA obtained from paraffin-embedded tissue was analyzed (33). In this study, archival tumor specimens from 55 Ashkenazi Jewish patients treated at Memorial Sloan-Kettering Cancer Center were analyzed (41). Genomic DNA extracted from these samples was analyzed using PCR/LDR, and seven of the 55 samples were found to carry the APCI1307K allele. Four of the positive samples found by PCR/LDR analysis are shown (Fig. 6, bottom). In contrast, only one of the 36 pancreatic cancer samples tested positive for the APCI1307K allele. 
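The carrier counts above translate directly into prevalences that can be compared with the 6.1% carrier frequency cited in the Introduction. The Python sketch below is a rough arithmetic check only. The article reports χ2 = 4.4 for the colorectal comparison but does not name the test used; this sketch assumes a one-degree-of-freedom goodness-of-fit test against a fixed 6.1% frequency, which is the example's own choice. Under that assumption the statistic comes out at about 4.2, so the sketch should not be read as a reproduction of the published calculation. The function names are hypothetical.

```python
import math


def prevalence(carriers: int, total: int) -> float:
    """Fraction of samples carrying the allele."""
    return carriers / total


def chi2_goodness_of_fit(carriers: int, total: int, expected_freq: float) -> tuple[float, float]:
    """Chi-square test (1 d.f.) of observed carrier counts against a fixed frequency.

    This is an assumed test, chosen for illustration; the source does not
    state how its chi-square value was obtained.
    """
    expected_pos = total * expected_freq
    expected_neg = total - expected_pos
    observed_neg = total - carriers
    chi2 = ((carriers - expected_pos) ** 2 / expected_pos
            + (observed_neg - expected_neg) ** 2 / expected_neg)
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    print(f"colorectal: {prevalence(7, 55):.1%}")
    print(f"pancreatic: {prevalence(1, 36):.1%}")
    chi2, p = chi2_goodness_of_fit(7, 55, 0.061)
    print(f"chi2 = {chi2:.2f}, P = {p:.3f}")
```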
The prevalence rate of the allele among the colorectal cancer samples (12.7%) was similar to the rate published in the original Vogelstein study (10.4%) (26). This rate is significantly higher than the 6.1% carrier frequency of the polymorphism in the Ashkenazi Jewish population (χ2 = 4.4, P < 0.036). This result is in agreement with the statistically significant association obtained between the development of colorectal cancer and the APCI1307K allele in the study by Laken et al. (26).
Comparison of amplification-created restriction site (ACRS) analysis and PCR/LDR for detecting the APCI1307K mononucleotide repeat polymorphism The seven APCI1307K+ tumor samples were retested using PCR/LDR to confirm the presence of the APCI1307K allele. The samples were also sequenced and the presence of the allele was confirmed in the seven samples. ACRS analysis was also used in order to confirm the presence of the APCI1307K allele. The restriction enzyme used was AseI, which requires two base substitutions to convert the wild-type APC1307 sequence (AGAAAT) to its aTTAAT recognition sequence. The AseI conversion leaves the APCI1307K allele undigested, but digests sequences that were originally wild-type. Using the AseI conversion method, the presence of the APCI1307K allele was confirmed for the seven APCI1307K+ tumor samples, although PCR amplified fragments resistant to AseI digestion were also observed in wild-type samples (data not shown). Attempts to create an MseI restriction site (AAAT to AATT) gave false positive results, the consequence of polymerase misextension from a 3′-terminal mismatched base (42,43).
Concluding Remarks PCR/LDR has been shown to be a powerful tool in cancer and disease gene mutation detection (30–33,44). In this study, the specificity and fidelity limits of different thermostable ligase enzymes were compared for the analysis of polymorphisms in mononucleotide repeat sequences of different lengths. 
PCR conditions with various thermostable polymerases were compared in order to create a PCR/LDR assay for characterizing microsatellite instability in clinical samples. The optimal PCR conditions for amplification of polyadenosine mononucleotide repeat DNA necessitated the use of thermostable polymerases with exonucleolytic proofreading activity. The genetic alterations found in RER+ colorectal tumors occur primarily in repetitive DNA sequences such as mononucleotide repeats. RER+ tumors are less likely to have mutations in p53, are poorly differentiated, and are more prone to metastasize to other organs (45). Approximately 89% of patients with multiple primary tumors have RER+ tumors (46). Genes containing intragenic mononucleotide repeats which have been shown to be altered in RER+ colorectal tumors include the TGF-β Type II receptor, Bax gene, hMSH3 and hMSH6 genes (1,2,18,47–53). These intragenic mononucleotide repeats are all amenable to analysis using PCR/LDR (unpublished results). Furthermore, screening for the APCI1307K polymorphism was demonstrated on both genomic DNA from whole blood and archival tumor specimens from Ashkenazi Jewish individuals.
Figure 6 (Top) Detection of the APCI1307K allele in genomic DNA isolated from blood samples of Ashkenazi Jewish individuals using PCR/LDR. The upper band in each lane represents the mutant APCI1307K allele and the lower band represents the wild-type allele. In each case, the results were concordant with results obtained using hybridization with ASOs. +, mutation present; −, mutation absent. (Bottom) Detection of the APCI1307K allele in paraffin-embedded archival colorectal tumors. Genomic DNA was extracted from paraffin-embedded colorectal tumors from Ashkenazi Jewish individuals. The DNA was analyzed for the APCI1307K allele using PCR/LDR. The gel shows the results for 24 samples in which samples 5, 9, 13 and 16 were found to carry the APCI1307K allele. 
Multiplexed ligation reactions have now been validated to distinguish 19 possible mutations in K-ras (33), over 30 mutations in the cystic fibrosis transmembrane conductance regulator gene (44), and over 55 mutations in the p53 gene (unpublished results). PCR/LDR reactions are compatible with a universal DNA array detection scheme (54), and with this work, we extend this versatility to include mononucleotide repeat sequences as well. This level of multiplexing is not possible with other mutation detection methods such as allele-specific PCR (AS-PCR), ASO hybridization, dideoxy sequencing, or restriction endonuclease digestion. For AS-PCR, attempts to analyze more than four mutations have required splitting different sets of primers into different tubes either because of closely clustered mutations or failure to PCR amplify all allele-specific primers simultaneously (55–61). Likewise, restriction digestion methods are not amenable to multiplexing. Further, for both allele-specific and ACRS techniques, false positive results may be obtained due to polymerase misextension from a 3′-terminal mismatched base (42,43). 
Molecular characterization of colorectal tumors may eventually play a role in the clinical management of this disease. There is already some indication that adjuvant chemotherapy based on the status of molecular markers may be beneficial for some patients (62). Microsatellite instability in colorectal tumors is associated with diploid status, tumors in the proximal colon, and poor differentiation (63). Preliminary studies have also shown that patients with tumors that demonstrated microsatellite instability at two or more loci had an increased overall survival rate when compared to patients with tumors which lacked genomic instability (64). The ability to rapidly characterize alterations in microsatellite repeats using ligase-based assays may eventually aid in the clinical management of colorectal cancer.
Acknowledgements
The authors would like to acknowledge Harry Ostrer and Carol Oddoux from the Human Genetic Program at New York University Medical College for providing blind genomic DNA samples previously typed for the APC I1307K allele from Ashkenazi Jewish individuals for the APC I1307K experiments. The authors would also like to acknowledge Norman Gerry, Reyna Favis, Weiguo Cao, Jie Tong, Joseph Day, Andrew Grace and Matt D'Alessio for their helpful suggestions and technical assistance. M.Z. was supported by the William M. Keck Fellowship and the Tri-Institutional Medical Scientist Training Program (MSTP) Fellowship during the course of this study. Support for this work was provided by the National Cancer Institute (P01-CA65930 and RO1-CA81467). T.M. and G.N. were supported by NHLBI-SCOR HL56987 and NIA AG12712.
References
1 Ionov Y., Peinado M.A., Malkhosyan S., Shibata D., Perucho M. Nature, 1993, vol. 363 (pg. 558–561)
2 Shibata D., Peinado M.A., Ionov Y., Malkhosyan S., Perucho M. Nat. Genet., 1994, vol. 6 (pg. 273–281)
3 Chen J., Heerdt B.G., Augenlicht L.H. Cancer Res., 1995, vol. 55 (pg. 174–180)
4 Cawkwell L., Lewis F.A., Quirke P. Br. J. Cancer, 1994, vol. 70 (pg. 813–818)
5 Liu B., Nicolaides N.C., Markowitz S., Willson J.K., Parsons R.E., Jen J., Papadopolous N., Peltomaki P., de la Chapelle A., Hamilton S.R., et al. Nat. Genet., 1995, vol. 9 (pg. 48–55)
6 Orth K., Hung J., Gazdar A., Bowcock A., Mathis J.M., Sambrook J. Proc. Natl Acad. Sci. USA, 1994, vol. 91 (pg. 9495–9499)
7 Aaltonen L.A., Peltomaki P., Leach F.S., Sistonen P., Pylkkanen L., Mecklin J.P., Jarvinen H., Powell S.M., Jen J., Hamilton S.R., et al. Science, 1993, vol. 260 (pg. 812–816)
8 Risinger J.I., Berchuck A., Kohler M.F., Watson P., Lynch H.T., Boyd J. Cancer Res., 1993, vol. 53 (pg. 5100–5103)
9 Perucho M. Nature Med., 1996, vol. 2 (pg. 630–631)
10 Fishel R., Lescoe M.K., Rao M.R.S., Copeland N.G., Jenkins N.A., Garber J., Kane M., Kolodner R. Cell, 1993, vol. 75 (pg. 1027–1038)
11 Parsons R., Li G.M., Longley M.J., Fang W.H., Papadopoulos N., Jen J., de la Chapelle A., Kinzler K.W., Vogelstein B., Modrich P. Cell, 1993, vol. 75 (pg. 1227–1236)
12 Papadopoulos N., Nicolaides N.C., Wei Y.F., Ruben S.M., Carter K.C., Rosen C.A., Haseltine W.A., Fleischmann R.D., Fraser C.M., Adams M.D., et al. Science, 1994, vol. 263 (pg. 1625–1629)
13 Leach F.S., Nicolaides N.C., Papadopoulos N., Liu B., Jen J., Parsons R., Peltomaki P., Sistonen P., Aaltonen L.A., Nystrom-Lahti M., et al. Cell, 1993, vol. 75 (pg. 1215–1225)
14 Nicolaides N.C., Papadopoulos N., Liu B., Wei Y.F., Carter K.C., Ruben S.M., Rosen C.A., Haseltine W.A., Fleischmann R.D., Fraser C.M., et al. Nature, 1994, vol. 371 (pg. 75–80)
15 Bronner C.E., Baker S.M., Morrison P.T., Warren G., Smith L.G., Lescoe M.K., Kane M., Earabino C., Lipford J., Lindblom A., et al. Nature, 1994, vol. 368 (pg. 258–261)
16 Lynch H.T., Smyrk T. Cancer, 1996, vol. 78 (pg. 1149–1167)
17 Brassett C., Joyce J.A., Froggatt N.J., Williams G., Furniss D., Walsh S., Miller R., Evans D.G., Maher E.R. J. Med. Genet., 1996, vol. 33 (pg. 981–985)
18 Markowitz S., Wang J., Myeroff L., Parsons R., Sun L., Lutterbaugh J., Fan R.S., Zborowska E., Kinzler K.W., Vogelstein B., et al. Science, 1995, vol. 268 (pg. 1336–1338)
19 Parsons R., Myeroff L.L., Liu B., Willson J.K., Markowitz S.D., Kinzler K.W., Vogelstein B. Cancer Res., 1995, vol. 55 (pg. 5548–5550)
20 Brattain M.G., Markowitz S.D., Willson J.K. Curr. Opin. Oncology, 1996, vol. 8 (pg. 49–53)
21 Ilyas M., Tomlinson I.P. J. Pathol., 1997, vol. 182 (pg. 128–137)
22 Powell S.M., Zilz N., Beazer-Barclay Y., Bryan T.M., Hamilton S.R., Thibodeau S.N., Vogelstein B., Kinzler K.W. Nature, 1992, vol. 359 (pg. 235–237)
23 Kinzler K.W., Vogelstein B. Cell, 1996, vol. 87 (pg. 159–170)
24 Fearon E.R., Vogelstein B. Cell, 1990, vol. 61 (pg. 759–767)
25 Eckert W.A., Jung C., Wolff G. J. Med. Genet., 1994, vol. 31 (pg. 442–447)
26 Laken S.J., Petersen G.M., Gruber S.B., Oddoux C., Ostrer H., Giardiello F.M., Hamilton S.R., Hampel H., Markowitz A., Klimstra D., Jhanwar S., Winawer S., Offit K., Luce M.C., Kinzler K.W., Vogelstein B. Nat. Genet., 1997, vol. 17 (pg. 79–83)
27 Weber J.L. Curr. Opin. Biotechnol., 1990, vol. 1 (pg. 166–171)
28 Mansfield D.C., Brown A.F., Green D.K., Carothers A.D., Morris S.W., Evans H.J., Wright A.F. Genomics, 1994, vol. 24 (pg. 225–233)
29 Ziegle J.S., Su Y., Corcoran K.P., Nie L., Mayrand P.E., Hoff L.B., McBride L.J., Kronick M.N., Diehl S.R. Genomics, 1992, vol. 14 (pg. 1026–1031)
30 Barany F. Proc. Natl Acad. Sci. USA, 1991, vol. 88 (pg. 189–193)
31 Barany F. PCR Methods Appl., 1991, vol. 1 (pg. 5–16)
32 Day D., Speiser P.W., White P.C., Barany F. Genomics, 1995, vol. 29 (pg. 152–162)
33 Khanna M., Park P., Zirvi M., Paty P., Barany F. Oncogene, 1999, vol. 18 (pg. 27–38)
34 Barany F., Gelfand D. Gene, 1991, vol. 109 (pg. 1–11)
35 Luo J., Bergstrom D.E., Barany F. Nucleic Acids Res., 1996, vol. 24 (pg. 3071–3078)
36 Tong J., Cao W., Barany F. Nucleic Acids Res., 1999, vol. 27 (pg. 788–794)
37 Hite J.M., Eckert K.A., Cheng K.C. Nucleic Acids Res., 1996, vol. 24 (pg. 2429–2434)
38 Eckert K.A., Kunkel T.A. PCR Methods Appl., 1991, vol. 1 (pg. 17–24)
39 Casares S., Ionov Y., Ge H.Y., Stanbridge E., Perucho M. Oncogene, 1995, vol. 11 (pg. 2303–2310)
40 McCaffrey T.A., Du B., Consigli S., Szabo P., Bray P.J., Hartner L., Weksler B.B., Sanborn T.A., Bergman G., Bush H.L. J. Clin. Invest., 1997, vol. 100 (pg. 2182–2188)
41 Guillem J.G., Paty P.B., Cohen A.M. Cancer J. Clin., 1997, vol. 47 (pg. 113–128)
42 Day J., Bergstrom D., Hammer R., Barany F. Nucleic Acids Res., 1999, vol. 27 (pg. 1810–1818)
43 Day J., Hammer R., Bergstrom D., Barany F. Nucleic Acids Res., 1999, vol. 27 (pg. 1819–1827)
44 Eggerding F.A., Iovannisci D.M., Brinson E., Grossman P., Winn-Deen E.S. Hum. Mutat., 1995, vol. 5 (pg. 153–165)
45 Muta H., Noguchi M., Perucho M., Ushio K., Sugihara K., Ochiai A., Nawata H., Hirohashi S. Cancer, 1996, vol. 77 (pg. 265–270)
46 Horii A., Han H.J., Shimada M., Yanagisawa A., Kato Y., Ohta H., Yasui W., Tahara E., Nakamura Y. Cancer Res., 1994, vol. 54 (pg. 3373–3375)
47 Simms L.A., Zou T.T., Young J., Shi Y.Q., Lei J., Appel R., Rhyu M.G., Sugimura H., Chenevix-Trench G., Souza R.F., Meltzer S.J., Leggett B.A. Oncogene, 1997, vol. 14 (pg. 2613–2618)
48 Ouyang H., Shiwaku H.O., Hagiwara H., Miura K., Abe T., Kato Y., Ohtani H., Shiiba K., Souza R.F., Meltzer S.J., Horii A. Cancer Res., 1997, vol. 57 (pg. 1851–1854)
49 Lengauer C., Kinzler K.W., Vogelstein B. Nature, 1997, vol. 386 (pg. 623–627)
50 Rampino N., Yamamoto H., Ionov Y., Li Y., Sawai H., Reed J.C., Perucho M. Science, 1997, vol. 275 (pg. 967–969)
51 Souza R.F., Appel R., Yin J., Wang S., Smolinski K.N., Abraham J.M., Zou T.T., Shi Y.Q., Lei J., Cottrell J., Cymes K., Biden K., Simms L., Leggett B., Lynch P.M., Frazier M., Powell S.M., Harpaz N., Sugimura H., Young J., Meltzer S.J. Nat. Genet., 1996, vol. 14 (pg. 255–257)
52 Yamamoto H., Sawai H., Perucho M. Cancer Res., 1997, vol. 57 (pg. 4420–4426)
53 Yamamoto H., Sawai H., Weber T.K., Rodriguez-Bigas M.A., Perucho M. Cancer Res., 1998, vol. 58 (pg. 997–1003)
54 Gerry N., Witowski N., Barany G., Barany F. J. Mol. Biol., 1999, vol. 292 (pg. 251–262)
55 Wenham P.R., Newton C.R., Price W.H. Clin. Chem., 1991, vol. 37 (pg. 241–244)
56 Zschocke J., Graham C.A. Mol. Cell. Probes, 1995, vol. 9 (pg. 447–451)
57 Fortina P., Dotti G., Conant R., Monokian G., Parrella T., Hitchcock W., Rappaport E., Schwartz E., Surrey S. PCR Methods Appl., 1992, vol. 2 (pg. 163–166)
58 Baty D., Terron Kwiatkowski A., Mechan D., Harris A., Pippard M.J., Goudie D. J. Clin. Pathol., 1998, vol. 51 (pg. 73–74)
59 Scobie G., Woodroffe B., Fishel S., Kalsheker N. Mol. Hum. Reprod., 1996, vol. 2 (pg. 203–207)
60 Patel P., Lo Y.M., Bell J.I., Wainscoat J.S. J. Clin. Pathol., 1993, vol. 46 (pg. 1105–1108)
61 Ferrie R.M., Schwarz M.J., Robertson N.H., Vaudin S., Super M., Malone G., Little S. Am. J. Hum. Genet., 1992, vol. 51 (pg. 251–262)
62 O'Connell M.J., Schaid D.J., Ganju V., Cunningham J., Kovach J.S., Thibodeau S.N. Cancer, 1992, vol. 70 Suppl (pg. 1732–1739)
63 Lothe R.A., Peltomaki P., Meling G.I., Aaltonen L.A., Nystrom-Lahti M., Pylkkanen L., Heimdal K., Andersen T.I., Moller P., Rognum T.O., et al. Cancer Res., 1993, vol. 53 (pg. 5849–5852)
64 Sidransky D. Science, 1997, vol. 278 (pg. 1054–1059)
© 1999 Oxford University Press
A multiple-capillary electrophoresis system for small-scale DNA sequencing and analysis
Zhang, Jianzhong; Voss, Karl O.; Shaw, Diana F.; Roos, K. Pieter; Lewis, Darren F.; Yan, Juying; Jiang, Rong; Ren, Hongji; Hou, Joan Y.; Fang, Yu; Puyang, Xiaoling; Ahmadzadeh, Hossein; Dovichi, Norman J.
doi: N/A; pmid: N/A
A five-capillary system has been developed for DNA sequencing and analysis. The post-column fluorescence detector is based on a sheath-flow cuvette. The instrument provides uniform and continuous illumination of the samples. The cuvette virtually eliminates cross-talk in the fluorescence signal between capillaries. Discrete single-photon counting avalanche photodiodes provide high efficiency light detection. The instrument has detection limits (3σ) of 130 ± 30 fluorescein molecules injected onto each capillary. Over 650 bases of sequence at 98.8% accuracy were generated in 100 min at 50°C from M13mp18. Separation and detection of short tandem repeats proved efficient and accurate with the use of internal standards for direct comparison of migration times between capillaries.
Ligase-based detection of mononucleotide repeat sequences
Zirvi, Monib; Barany, Francis; Nakayama, Takamori; Newman, Gregg; McCaffrey, Timothy; Paty, Philip
doi: N/A; pmid: N/A
Up to 15% of all colorectal cancers are considered to be replication error positive (RER+) and contain mutations at hundreds of thousands of microsatellite repeat sequences. Recently, a number of intragenic mononucleotide repeat sequences have been demonstrated to be targets for inactivating genes in RER+ colorectal tumors. In this study, thermostable DNA ligases were tested for the ability to detect alterations in microsatellite sequences in colon tumor samples. Ligation profiles on mononucleotide repeat sequences were determined for four related thermostable DNA ligases, Thermus thermophilus (Tth) ligase, Thermus sp. AK16D ligase, Aquifex aeolicus ligase and the K294R mutant of the Tth ligase. While the limit of detection for point mutations was one mutation in 1000 wild-type sequences, the ability to detect a single base deletion in a 10 base mononucleotide repeat was one mutation in 100 wild-type sequences. Furthermore, the misligation error increased exponentially as the length of the mononucleotide repeat increased, and was 10% of the correct signal for a 19 base mononucleotide repeat. A fluorescent ligase-based assay [polymerase chain reaction/ligase detection reaction (PCR/LDR)] correlated with results obtained using a radioactive assay to detect instability within the TGF-β Type II receptor gene. PCR/LDR was also used to detect the APC I1307K mononucleotide repeat allele which has a carrier frequency of 6.1% in Ashkenazi Jewish individuals. In a blind study, 30 samples that had been typed for the presence of the APC I1307K allele were tested. The PCR/LDR results correlated with those obtained using sequencing and allele-specific oligonucleotide hybridization for 16 samples carrying the mutation and 13 wild-type samples. Ligation assays that characterize mononucleotide repeats can be used to rapidly detect somatic mutations in tumors, and to screen for individuals who have a hereditary predisposition to develop colon cancer.
Comprehensive transcript analysis in small quantities of mRNA by SAGE-Lite
Peters, David G.; O'Hare, Elisa Heidrich; Ferrell, Robert E.; Kassam, Amin B.; Yonas, Howard; Brufsky, Adam M.
doi: 10.1093/nar/27.24.e39; pmid: 10572191
Abstract
Serial analysis of gene expression (SAGE) is a powerful technique that can be used for global analysis of gene expression. Its chief advantage over other methods is that SAGE does not require prior knowledge of the genes of interest and provides quantitative and qualitative data of potentially every transcribed sequence in a particular tissue or cell type. Furthermore, SAGE can quantify low-abundance transcripts and reliably detect relatively small differences in transcript abundance between cell populations. However, SAGE demands high input levels of mRNA which are often unavailable, particularly when studying human disease. To overcome this limitation, we have developed a modification of SAGE that allows detailed global analysis of gene expression in extremely small quantities of tissue or cultured cells. We have called this approach ‘SAGE-Lite’. This technique was used for the global analysis of transcription in samples of normal and pathological human cerebrovasculature to study the molecular pathology of intracranial aneurysms. These samples, which are obtained during operative surgical repair, are typically no bigger than 1 or 2 mm and yield <100 ng of total RNA. In addition, we show that SAGE-Lite allows simple and rapid isolation of long cDNAs from short (15 bp) SAGE sequence tags.
Introduction
The human genome contains ∼3 billion nucleotides, from which an estimated 100 000 functional genes are transcribed. Of these, it is thought that ∼11 000 are expressed in any one cell type (1) and that as many as 80% are present at a level of less than five copies per cell (2). Despite exponential growth in the use of molecular methods to study gene expression in normal and pathological tissue specimens, most efforts have focused on analyzing single or small numbers of genes. This has resulted in a plethora of non-overlapping, mutually exclusive, functional data, the majority of which has no relevance in a holistic sense.
To fully understand the biological mechanisms controlling development, normal homeostasis and disease, it is vital that we are able to accurately quantify gene expression in a global context. This will provide insight into whole-cell responses to defined physiological conditions and establish foundations for further investigations into therapeutics, diagnostics and basic biology. One of the most pressing objectives in health sciences research, therefore, is the development of a robust technology for quantitative determination of global gene expression profiles in normal and pathological states. Although significant progress has been made in the characterization of large-scale transcriptional regulation in easily accessible model systems (3–7), fewer studies have utilized human pathological specimens, which are often difficult to acquire and, in many instances, are limited in size and difficult to obtain fresh (2,8). There are a number of PCR-based techniques for the analysis of low levels of mRNA. Traditionally these have been applied to the quantitative or semi-quantitative characterization of single or small numbers of transcripts (9,10). Other techniques suitable for the isolation and analysis of genes from limiting quantities of mRNA include differential display (11) and subtractive hybridization (12). Although these techniques can be used for cloning differentially expressed genes, they do not provide quantitative measures of differential expression nor allow global analyses of transcript levels. More recently, nucleic acid hybridization-based technologies have been developed that are able to analyze multiple transcripts simultaneously (13). The major drawback of these approaches is that they only allow analysis of previously cloned sequences.
An extremely powerful technique available for the global analysis of gene expression levels is serial analysis of gene expression (SAGE) (14), which allows a complete quantitative transcript analysis of a specific cell or tissue type. This is a comprehensive approach which relies on direct sequencing of gene-specific tags, rather than hybridization, and has an extremely high degree of accuracy for predicting quantitative relationships between transcript levels. This is particularly true when these are expressed at low abundance (2). Furthermore, its comprehensive (global) nature ensures that the relative abundance of any single transcript may be measured against every other transcript in the cell or tissue type. This makes it possible to accurately compare gene expression patterns between samples assayed at different times and to develop comprehensive databases of relative levels of transcript abundance in multiple tissue/cell types. The principal drawback of SAGE is the requirement for large quantities of high quality mRNA. This does not present a problem when cultured cells (3,15) or abundant tissue samples (2) are the subject of investigation. However, many studies demand analysis of gene expression patterns when mRNA and/or tissue abundance is limiting. These include small surgical samples, tissues which have been stored in sub-optimal conditions, and tissues (such as the vasculature) which have relatively low levels of transcription. It is clear that to fully utilize rapid advances in molecular biology we require the ability to accurately and comprehensively analyze small tissue samples in a variety of normal and pathological states. This is particularly relevant considering the heterogeneity and multigenic basis of many human diseases that will require individual molecular diagnosis for optimal management and intervention. 
In order to address the need for detailed comprehensive gene expression analysis we have developed a modified approach to SAGE which we have called SAGE-Lite. This method allows detailed global analysis of gene expression patterns from 50 ng of total RNA, thus extending global expression profiling to studies in which RNA is limited.
Materials and Methods
Cell culture and RNA extraction
HT1080 cells were obtained from the American Type Culture Collection (ATCC). Cells were grown in Dulbecco's modified Eagle's medium and harvested directly in Trizol reagent used for total RNA isolation (Life Technologies). mRNA was isolated from this total RNA using the Messagemaker system (Life Technologies).
Tissue acquisition and RNA extraction
Cerebrovascular tissue samples were removed during surgery for repair of intracranial aneurysm (ICA). Samples were snap-frozen on dry ice and stored at −130°C until ready for use. Tissue was homogenized and total RNA extracted using the SV RNA isolation system (Promega). An aliquot of 100 ng of total RNA was used directly as a template for the synthesis of first-strand cDNA.
Synthesis of non-amplified double-stranded cDNA
Un-amplified double-stranded cDNA was synthesized from purified mRNA using the cDNA Synthesis System (Life Technologies #18267-021).
Reverse transcription (RT) and global PCR-amplification of cDNA
First-strand cDNA was prepared using Superscript II (Life Technologies). Synthesis was primed using a biotinylated oligo(dT) primer (5′-AAGCAGTGGTAACAACGCAGAGTACT(30)VN-3′ [N = A, G, C or T; V = A, G or C]) in the presence of a second, TS, oligonucleotide (5′-AAGCAGTGGTAACAACGCAGAGTACGCGGG) which acts as a second template during strand switching (16). For SAGE, first-strand cDNA was amplified using the Advantage cDNA Synthesis System (Clontech) according to the manufacturer's instructions using the same oligonucleotides employed during first-strand synthesis.
For whole-cDNA dot blot, the same first-strand cDNA was amplified using a single primer (5′-AAGCAGTGGTAACAACGCAGAGT).
Dot blotting and hybridization
Un-amplified and/or amplified cDNA pools from HT1080 cells or cerebrovascular tissue were denatured by heating at 95°C for 10 min in a total volume of 500 µl and a final concentration of 0.4 M NaOH, 0.1 M EDTA (pH 7.5). cDNAs were then vacuum blotted onto Zeta Probe GT membrane (Bio-Rad), fixed by heating at 80°C for 2 h, hybridized overnight at 65°C in 0.5 M Na2PO4, 7% SDS and washed in 40 mM Na2PO4, 5% SDS at 65°C.
SAGE
SAGE was carried out as described previously (14).
Cloning of SAGE-identified cDNAs
SAGE tag sequences (11 bp) were used to design a 15 bp (11 bases plus NlaIII recognition sequence) oligonucleotide primer and its reverse complement which were used in separate PCR reactions with either the oligo(dT) primer or TS primer respectively. The resulting fragments were excised from agarose gels and cloned using the pCRScript kit (Stratagene).
Single-gene RT-PCR
mRNA was reverse transcribed (in duplicate) from total RNA using the Superscript II system (Life Technologies) according to the manufacturer's instructions. Control reactions were included which contained no reverse transcriptase to ensure that subsequent product amplification was reverse transcriptase-specific. Aliquots of resulting cDNAs were used in 25 µl PCR reactions containing 1× PCR buffer [50 mM KCl, 10 mM Tris-HCl (pH 9.0), 0.1% Triton X-100], 1.5 mM MgCl2, 0.25 µM each primer, 125 µM each dNTP, 5 µCi [α-32P]dCTP (3000 Ci/mmol) and 1 U Taq Polymerase (Promega). PCR reaction conditions were optimized for cycle number and input RNA to ensure product accumulation within the linear phase of amplification. PCR products were electrophoresed on 6% polyacrylamide gels in 1× TAE (0.04 M Tris-acetate, 0.001 M EDTA, pH 8.0), dried and autoradiographed directly.
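The primer design described under "Cloning of SAGE-identified cDNAs" can be illustrated with a short sketch. It assumes that the 15-base primer is simply the NlaIII recognition sequence (CATG) followed by the 11-base tag, as the text implies; the function names and the example tag are hypothetical and are not taken from the original work.

```python
from typing import Tuple

NLAIII_SITE = "CATG"  # NlaIII recognition sequence (anchoring enzyme)
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def primers_from_tag(tag: str) -> Tuple[str, str]:
    """Build the 15-base primer (CATG + 11-base tag) and its reverse complement.

    Following the text, the first primer would be paired with the oligo(dT)
    primer and the second with the TS primer, in separate PCR reactions.
    """
    tag = tag.upper()
    if len(tag) != 11 or set(tag) - set("ACGT"):
        raise ValueError("expected an 11-base tag made of A, C, G and T")
    sense = NLAIII_SITE + tag
    return sense, reverse_complement(sense)


# Hypothetical tag, for illustration only.
sense_primer, antisense_primer = primers_from_tag("GACCTGAGTTC")
print(sense_primer, antisense_primer)
```

Because CATG is its own reverse complement, both primers carry the anchoring site, one at the 5′ end and one at the 3′ end.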
Results
cDNA amplification to generate SAGE substrate
In order to generate amplified libraries of cDNA for SAGE, we developed a modification of an approach described by Chenchik et al. (16) (Fig. 1). This method relies upon the fact that reverse transcriptase has inherent poly(C) terminal transferase activity and is able to switch templates during DNA polymerization (17–20). These properties are part of the retroviral life-cycle. First-strand cDNA synthesis by reverse transcriptase is primed by an oligo(dT) primer in the presence of a second ‘template switching’ oligonucleotide (TS oligo) which is included in the first-strand synthesis reaction. The TS oligo has a short poly(G) sequence at its 3′ end that is able to hybridize to the poly(C) sequence generated by reverse transcriptase. The reverse transcriptase is thus able to switch templates and incorporates the complementary TS sequence at its newly synthesized 3′ end. Every first-strand cDNA thus has a common sequence at its 3′ end which is complementary to the TS sequence.
Figure 1. Schematic diagram of whole-cDNA PCR amplification. Modified from Chenchik et al. (16). See Materials and Methods.
Figure 2. PCR-amplified cDNA. Input of total RNA for PCR is indicated (200, 500 or 1000 ng), as is cycle number. Reactions were performed in duplicate as indicated. Fragment sizes of the DNA ladder (M) are shown in kb.
First-strand cDNA is then amplified by PCR using the oligo(dT) and TS primer.
Since SAGE requires that first-strand cDNA be biotinylated at its 5′ end, a biotinylated oligo(dT) primer was used. This amplification resulted in libraries of cDNA molecules that varied in length from between ∼500 and 6000 bp (Fig. 2).
Confirmation of representative amplification of cDNA
Although under ideal conditions it is possible to exponentially amplify a single amplicon, quantitative PCR is fraught with technical difficulties that result in non-representative amplification. There are a number of reasons for this. PCR reagents may become limiting as the reaction proceeds, the thermo-stable DNA polymerase may partially degrade, and the increased concentration of template relative to primer may result in mis-priming. Quantitative or semi-quantitative PCR is, therefore, most accurate at lower cycle numbers during the exponential phase of amplification (21). Clearly, it is important that differences in SAGE tag abundance observed at the data analysis stage are not merely PCR artifacts. In order to determine whether our PCR strategy gave representative amplification of multiple cDNAs we performed a number of control amplifications from a single pool of total RNA. This was prepared from HT1080 cells and used to generate both PCR-derived cDNA pools (Materials and Methods), amplified under a variety of conditions, and double-stranded cDNA synthesized by standard techniques (22). Products of these reactions were analyzed by agarose gel electrophoresis (Fig. 2). As previously reported, we found that as cycle number increased, more fragments of >6 kb were visualized. This is diagnostic of ‘overcycling’ (16).
Figure 3. Ratio of amplified to non-amplified cDNA. Double-stranded cDNA pools, generated either by PCR (16) or by traditional methods (22) were blotted onto nylon membrane and hybridized with radiolabeled gene-specific probes. Signals were quantified using a phosphorimager.
An aliquot of 200 ng of total RNA was reverse transcribed and one-fifth of this used as a template for cDNA amplification by PCR. Note that the ratio of signal between amplified and un-amplified cDNA is relatively similar for all five markers, indicating representative amplification by PCR.
We then used a dot blot approach to compare the copy numbers of specific genes in the PCR-derived cDNA with the copy numbers of the same genes derived from the traditionally synthesized cDNA. Whole-cell cDNA pools generated by these methods were dot-blotted and hybridized with a number of radiolabeled probes. If amplification is representative, the ratio of signal from amplified to un-amplified cDNA should be equivalent. Figure 3 shows that, at lower cycle numbers (22 cycles), the ratios of signal derived from cDNA amplified by PCR and cDNA synthesized in a traditional manner from abundant RNA are almost identical with respect to five single transcripts. This relationship between amplified and un-amplified signal was not observed at cycle numbers (26–34 cycles) beyond those recommended by the commercially available SMART™ user manual (data not shown). The SMART™ system is a cDNA synthesis and library construction kit marketed by Clontech (catalog #K1052-1).
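The logic of the dot-blot check above (amplification is representative if the amplified to un-amplified signal ratio is about the same for every probe) can be sketched as follows. This is an illustrative calculation only: the probe names, signal values and the 20% coefficient-of-variation cut-off are assumptions made for the example and do not come from the study.

```python
from statistics import mean, pstdev


def amplification_ratios(amplified, unamplified):
    """Amplified / un-amplified signal ratio for each probe."""
    return {probe: amplified[probe] / unamplified[probe] for probe in unamplified}


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    values = list(values)
    return pstdev(values) / mean(values)


def is_representative(amplified, unamplified, max_cv=0.2):
    """True if the ratios are similar across probes (max_cv is an assumed cut-off)."""
    ratios = amplification_ratios(amplified, unamplified)
    return coefficient_of_variation(ratios.values()) <= max_cv


# Hypothetical phosphorimager signals (arbitrary units) for five probes.
amplified = {"probe1": 5200.0, "probe2": 980.0, "probe3": 310.0,
             "probe4": 2050.0, "probe5": 145.0}
unamplified = {"probe1": 500.0, "probe2": 100.0, "probe3": 30.0,
               "probe4": 210.0, "probe5": 15.0}

print(amplification_ratios(amplified, unamplified))
print(is_representative(amplified, unamplified))
```

A single summary statistic is used here for brevity; inspecting the individual ratios, as in Figure 3, remains the more direct check.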
We found that a single 100 µl PCR reaction, containing one-fifth of the single-stranded product from a single 50 ng total RNA reverse transcription, yielded ∼5 µg of double-stranded cDNA. This is well within the recommended range of input for SAGE of 2–5 µg of poly(A)+ mRNA (14).
Amplified cDNA can be used as a substrate for SAGE
Following PCR amplification, cDNA was extracted with phenol/chloroform and precipitated; 10 mg of resuspended cDNA was then digested with NlaIII and subjected to SAGE. Briefly, the 3′ ends of NlaIII-digested cDNA are purified by biotin capture onto streptavidin-coated paramagnetic beads and synthetic linkers are annealed to the exposed NlaIII site. Short gene-specific tags are then released by linker-directed digestion using a type IIS restriction enzyme (BsmFI) which cuts at a sequence-independent defined distance away from its recognition site. Tags generated from parallel reactions are then ligated to form ‘ditags’, amplified by PCR, and linker sequence removed by NlaIII digestion. These ditags are then ligated to form concatomers which are cloned and sequenced. Computer analysis is then performed to determine the relative abundance of gene-specific tags.
Figure 4. (A) Lane 1, SAGE ditags after PCR amplification; lane 2, 100 bp marker. Fragments (102 bp) were gel-purified and digested with NlaIII (anchoring enzyme). (B) Lane 1, 25 bp marker; lane 2, SAGE ditags after NlaIII digestion. Fragments (24–26 bp) were gel-purified and concatomerized by ligation with T4 DNA ligase (Gibco BRL). (C) Lane 1, concatomerized ditags. Ditags of between 400 and 2000 bp were purified from the gel and cloned into pZero (Invitrogen). Lane 2, 100 bp marker. (D) PCR amplification of SAGE concatomers cloned in pZero. Insert sizes vary from 600 to 1500 bp. (E) SAGE ABI 377 sequence output. Anchoring enzyme sites (NlaIII) and ditags are represented by solid and dotted lines respectively.
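As a rough illustration of how a SAGE tag relates to its transcript, the sketch below predicts the tag expected from a known cDNA: it looks for the 3′-most NlaIII site (CATG) and takes the bases immediately downstream. The 11-base tag length follows the Materials and Methods; the function name and the test sequence are hypothetical, and the input is assumed to be the sense strand written 5′ to 3′ without the poly(A) tail.

```python
ANCHOR_SITE = "CATG"  # NlaIII recognition sequence


def predicted_sage_tag(cdna, tag_len=11):
    """Return the bases just 3' of the last CATG, or None if no full tag exists."""
    seq = cdna.upper()
    site = seq.rfind(ANCHOR_SITE)  # 3'-most anchoring enzyme site
    if site == -1:
        return None  # transcript has no NlaIII site and yields no tag
    start = site + len(ANCHOR_SITE)
    tag = seq[start:start + tag_len]
    return tag if len(tag) == tag_len else None


# Hypothetical sequence with two NlaIII sites; only the 3'-most one defines the tag.
example = "GGCATGAAAAAAAAAAATTCATGACGTACGTACGTTTT"
print(predicted_sage_tag(example))
```

A lookup of this kind is one way tags could be matched against database entries; it ignores complications such as sequencing errors and alternative polyadenylation sites.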
Figure 4 demonstrates that double-stranded, amplified cDNA is a suitable substrate for SAGE. Concatomer length ranged between 600 and 1500 bp, and SAGE tags which matched GenBank entries were clearly derived, as predicted, from the most 3′ NlaIII recognition sequence (data not shown). Given our use of total RNA as a substrate for cDNA synthesis, it is notable that the percentage of sequence tags that match sequence derived from ribosomal RNA (rRNA) amongst the 50 most abundantly expressed found in our SAGE libraries (see below), was 4%. This compares favorably with at least two other studies published using traditional SAGE protocols in which the percentage of rRNA in the 50 most frequent tags was 12 (23) and 24% (24).
Differential transcription analysis using SAGE-Lite
ICA is a saccular dilation of an intracranial artery most often located at a branch point of a major artery on the Circle of Willis at the base of the brain. Despite the catastrophic consequences of a ruptured ICA and their relatively high incidence, the molecular basis of the disease is not known. We therefore used SAGE-Lite to study transcription in a small surgical sample taken from the dome of an ICA.
As a control we used a small piece of superficial temporal artery (STA) removed at time of surgery from the same individual. From our SAGE-Lite analysis we compiled a comprehensive catalogue of genes which are expressed in ICA and STA. A total of 11 495 and 7297 tags were sequenced in the ICA and STA libraries and these corresponded to 4924 and 3552 distinct sequences respectively. There were 994 (8.6%) duplicate ditag dimers in the ICA library and 139 (1.9%) in the STA library. These were each counted only as monomers in tag abundance analyses. This correlates well with previous studies (3). Of the 100 most highly expressed genes in ICA, 31% were differentially regulated with a >5-fold change in expression relative to STA. Of these 100 genes, 22% are not listed in GenBank. Similarly, 29% of the 100 most highly expressed genes in STA are differentially regulated with a >5-fold change in expression relative to ICA. Of these 100 genes, 25% are not listed in GenBank. Detailed results of this approach will be published elsewhere.
Figure 5. Col3A1 (A) and fibronectin (B) RT-PCR of cDNA derived from STA and ICA. Amplification was carried out in the presence of [α-32P]dCTP. Fragments were separated on a 5% polyacrylamide gel that was dried and visualized by autoradiography. cDNA source for these reactions was the same as that used for the global PCR shown in Figure 6B, in which GAPDH expression is equivalent.
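Because the two libraries differ in size, tag counts need to be scaled by the total number of tags sequenced before a fold change is computed. The sketch below illustrates that arithmetic using the library totals and duplicate counts reported above. The function names, the use of a floor of one tag to avoid division by zero, and the example tag counts in the usage note are assumptions for illustration and do not come from the study.

```python
ICA_TOTAL_TAGS = 11495  # tags sequenced in the ICA library (from the text)
STA_TOTAL_TAGS = 7297   # tags sequenced in the STA library (from the text)


def percent_duplicates(duplicate_ditags, total_tags):
    """Duplicate ditags expressed as a percentage of tags sequenced."""
    return 100.0 * duplicate_ditags / total_tags


def normalized_ratio(count_a, total_a, count_b, total_b, floor=1):
    """Ratio of tag frequencies between library A and library B.

    Counts of zero are replaced by `floor` (an assumed convention) so
    that tags absent from one library still give a finite ratio.
    """
    freq_a = max(count_a, floor) / total_a
    freq_b = max(count_b, floor) / total_b
    return freq_a / freq_b


def is_differential(count_ica, count_sta, threshold=5.0):
    """True if the normalized ICA:STA ratio exceeds the threshold in either direction."""
    ratio = normalized_ratio(count_ica, ICA_TOTAL_TAGS, count_sta, STA_TOTAL_TAGS)
    return ratio > threshold or ratio < 1.0 / threshold
```

For example, a hypothetical tag seen 30 times in ICA and 3 times in STA has a raw count ratio of 10 but a normalized ratio of about 6.3, which still passes the >5-fold threshold used above.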
To confirm that SAGE-Lite is able to identify genuine differences in levels of transcription, we performed low-cycle-number PCR from the original first-strand cDNA using a single common primer (Materials and Methods). Amplified whole-cell cDNA pools were blotted onto nylon membrane and hybridized with gene-specific probes for Collagen 3A1 and a novel marker (see below). Table 1 demonstrates that relative levels of gene-specific transcript detected by SAGE-Lite are confirmed by this strategy. Differential expression of Collagen 3A1 and Fibronectin was also confirmed using semi-quantitative RT-PCR (Fig. 5).
Table 1. Comparison of quantification of gene-specific transcription by SAGE-Lite and dot-blotting. Ratios of expression between ICA and STA are shown as determined by each method.
SAGE-Lite allows rapid cloning of novel genes
A number of the SAGE tags identified using this approach represented genes that were not recorded in GenBank. In attempting to clone and characterize these unidentified tags we were able to take advantage of the fact that our PCR-amplified cDNA pool contained known sequences at either end. We used primers based on a 15 bp SAGE tag synthesized in both orientations, along with the TS and oligo(dT) primers used for cDNA synthesis, to amplify a 900 bp cDNA whose expression was ∼3-fold greater in STA relative to ICA (this approach is summarized in Fig. 6A). This differential expression was confirmed when the novel fragment was used to probe a dot blot consisting of total cDNA from both ICA and STA (Fig. 6B). Using this approach we found that the expression of the novel cDNA varies 3-fold. This correlates exactly with the magnitude of change determined by SAGE-Lite.
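The primer design step described above can be sketched in a few lines. The idea is that a tag, prefixed with the NlaIII site, gives a sense-strand primer that points toward the 3′ end of the cDNA, while its reverse complement points toward the 5′ end. The function names and the example tag sequence are hypothetical and are not taken from the study.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def tag_primers(tag, anchor="CATG"):
    """Build both orientations of a tag-derived primer.

    `tag` is the sequence immediately 3' of the NlaIII site. The sense
    primer extends toward the poly(A) tail, so it would be paired with
    the oligo(dT)-based primer. The antisense primer extends toward the
    5' end, so it would be paired with the TS primer.
    """
    sense = anchor + tag.upper()
    antisense = reverse_complement(sense)
    return {"with_oligo_dT": sense, "with_TS": antisense}


# Hypothetical 11-base tag, giving 15-base primers once the NlaIII site is included.
primers = tag_primers("GATTACAGGCT")
```

In practice, primer length and melting temperature would also need to be checked, since a 15-base primer anneals at a lower temperature than the longer TS and oligo(dT) primers.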
Discussion
We have developed a novel modification of SAGE which we have called SAGE-Lite. SAGE-Lite is a robust approach to the comprehensive analysis of levels of transcription in tissues from which RNA is extremely limiting. It relies upon a modification of a previously described approach (16) in which first-strand cDNA is globally amplified by PCR to generate large quantities of double-stranded cDNA. This is then utilized as a substrate for SAGE analysis. We sought to develop this strategy because of our interest in characterizing the molecular pathology of ICAs, from which tissue samples are extremely small. Using SAGE-Lite, we identified a large number of genes which are differentially expressed in the dome of an ICA relative to STA and confirmed these changes in expression by both a cDNA dot-blotting strategy and semi-quantitative RT-PCR. Detailed analysis at this high level of resolution was previously impossible. By the application of SAGE-Lite we have generated a very large number of useful candidate genes for population-based studies of ICA and gained considerable insight into the molecular basis of ICA. A full report of our SAGE results will be published elsewhere.
Figure 6. (A) Rapid isolation of novel cDNA by PCR. First strand cDNA generated in the presence of the TS oligo (16) is used as a template. TS oligo was used in conjunction with the reverse complement (RC) of the SAGE tag (including the NlaIII site) for amplification. (B) Dot blot analysis of unknown mRNA expression in ICA and STA. GAPDH expression was also assayed to provide an internal control (to which the novel cDNA was normalized). Novel cDNA is expressed in STA at twice the level seen in ICA (see text).
Existing approaches to the problem of identifying differentially expressed genes which provide the most accurate quantitative analyses, such as traditional SAGE and conventional library screening, are united in their demand for high levels of input RNA. These rely upon the generation of double-stranded cDNA by traditional approaches in which RNase H and DNA Polymerase I are used to prime the synthesis of a second cDNA strand after first-strand synthesis by reverse transcriptase (22). In situations where RNA is limiting, cDNA may be amplified by PCR via unique primer-binding sites present at both ends of the cDNA. Existing methods to generate cDNA with such sites include: homopolymer tailing on the 3′ end of the first-strand cDNA (25); single-stranded anchor ligation (either to the 5′ end of the mRNA or 3′ end of the first-strand cDNA) (26); and double-stranded adapter ligation to the 5′ end of double-stranded cDNA (27). These are relatively long-winded, however, requiring multiple complex steps. Other methods such as differential display (11) and subtractive hybridization are suitable for use in situations where RNA is limiting but are only able to highlight qualitative differences in gene expression between samples and often result in a large number of false positive results (28). Another SAGE-based method to overcome the high input requirements of traditional SAGE, called ‘MicroSAGE’, has recently been published (29).
Unlike SAGE-Lite, in which small amounts of cDNA are amplified prior to SAGE, MicroSAGE [which utilizes 1–5 ng poly(A)+ mRNA] employs a two-step ditag amplification approach in which ditags are amplified for 28 cycles, then gel-purified and re-amplified for as many as a further 18 cycles. This strategy generates enough material for subsequent tag concatomerization and library construction. In our hands, this two-phase ditag amplification strategy yielded large numbers of duplicate ditags (not shown). This problem was apparently not encountered in the MicroSAGE report (29). Duplicate ditags are thought to be the result of non-representative over-amplification and are generally counted as monomers in tag abundance analysis. Clearly their overrepresentation increases the amount of tag sequencing that must be undertaken. Similarly, a potential caveat with regard to SAGE-Lite is that the PCR amplification step may not generate libraries of cDNA which are representative. This could occur if certain gene-specific transcripts are more, or less, efficiently amplified than the transcripts of other genes. By comparing ratios of gene-specific transcripts in amplified and unamplified pools of cDNA we found that transcripts were amplified in a highly representative manner. Differences in gene expression were confirmed by dot blotting and RT-PCR. In order to minimize the possibility that the PCR step results in non-representative amplification, cycle number was kept to a minimum (16). In our experience, 18 cycles of PCR are sufficient to generate enough cDNA for SAGE-Lite (not shown). It is also possible that the PCR process favors amplification of shorter transcripts, thus excluding long sequences from subsequent detection and analysis. Qualitative evaluation of our SAGE data did not support this notion. Indeed, we found a number of SAGE tags derived from extremely long cDNAs based on GenBank database information.
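The reasoning behind keeping cycle number low can be illustrated with a simple exponential model of PCR, in which a template with per-cycle efficiency e is multiplied by (1 + e) at every cycle. Under that model, two transcripts with slightly different efficiencies diverge by a factor that grows with each additional cycle. The sketch below is only an idealized illustration: the model ignores plateau effects, and the efficiency values are hypothetical rather than measured in this study.

```python
def ratio_distortion(eff_a, eff_b, cycles):
    """Factor by which the A:B abundance ratio changes after PCR.

    eff_a and eff_b are per-cycle amplification efficiencies between
    0 and 1, where 1.0 means perfect doubling at every cycle.
    """
    return ((1.0 + eff_a) / (1.0 + eff_b)) ** cycles


def distortion_by_cycle(eff_a, eff_b, cycle_numbers):
    """Tabulate the distortion factor for several cycle numbers."""
    return {n: ratio_distortion(eff_a, eff_b, n) for n in cycle_numbers}


# Hypothetical efficiencies of 95% and 90%; 18 cycles is the number
# reported in the text, the other cycle numbers are for comparison only.
table = distortion_by_cycle(0.95, 0.90, (10, 18, 30))
for cycles, factor in table.items():
    print(f"{cycles} cycles: ratio changes by a factor of {factor:.2f}")
```

Equal efficiencies give a factor of exactly 1 at any cycle number, which is the behaviour the comparison of amplified and unamplified cDNA pools was designed to check.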
One of the most powerful features of ‘traditional’ SAGE is that it identifies tags from previously unidentified, novel genes. Unfortunately, the conversion of a 14 or 15 bp tag into a full-length cDNA presents a significant technical challenge. However, as we have shown, SAGE-Lite greatly simplifies this problem. This is because our cDNA libraries have known sequences at either end, derived from the oligo(dT) and TS primers. A SAGE-Lite-derived tag may, therefore, be utilized to design a short primer which can be used in conjunction with either the TS or oligo(dT) primer to allow rapid recovery of near full-length cDNA (Fig. 6). Even if only partial cDNA sequence is obtained by PCR, this fragment greatly simplifies the isolation of full-length cDNA and genomic DNA by library screening. In summary, we have demonstrated that a modification of SAGE can be utilized to study transcription in studies where mRNA is extremely limited. We have named this approach ‘SAGE-Lite’. We were able to utilize SAGE to perform a detailed, comparative, global analysis of gene expression from just 100 ng of total RNA extracted from normal and pathological human cerebrovascular tissue.
Acknowledgements
We are grateful to D. Biernesser for help with preparation of this manuscript. This work was supported by NIH grant No. HL44682 (R.E.F.), NIH grant No. KO8 CA67993 (A.M.B.) and institutional funding from the University of Pittsburgh Cancer Institute.
References
1 Alberts B., Bray D., Lewis J., Raff M., Roberts K., Watson J.D. In Robertson M. (ed.), Molecular Biology of the Cell, 1994, New York, NY, Garland, pg. 369
2 Zhang L., Zhou W., Velculescu V.E., Kern S.E., Hruban R.H., Hamilton S.R., Vogelstein B., Kinzler K.W. Science, 1997, vol. 276 (pg. 1268-1272)
3 Velculescu V.E., Zhang L., Zhou W., Vogelstein J., Basrai M.A., Bassett D.E. Jr, Hieter P., Vogelstein B., Kinzler K.W. Cell, 1997, vol. 88 (pg. 243-251)
4 Iyer V.R., Eisen M.B., Ross D.T., Schuler G., Moore T., Lee J.C.F., Trent J.M., Staudt L.M., Hudson J., Boguski M.S., Lashkari D., Shalon D., Botstein D., Brown P. Science, 1999, vol. 283 (pg. 83-87)
5 Der S.D., Zhou A., Williams B.R.G., Silverman R.H. Proc. Natl Acad. Sci. USA, 1998, vol. 95 (pg. 15623-15628)
6 Jelinsky S.A., Sampson L.D. Proc. Natl Acad. Sci. USA, 1999, vol. 96 (pg. 1486-1491)
7 Zhu H., Cong J.-P., Mamtora G., Gingeras T., Shenk T. Proc. Natl Acad. Sci. USA, 1998, vol. 95 (pg. 14470-14475)
8 Heller R.A., Schena M., Chai A., Shalon D., Bedilion T., Gilmore J., Woolley D., Davis R.W. Proc. Natl Acad. Sci. USA, 1997, vol. 94 (pg. 2150-2155)
9 Veres G., Gibbs R.A., Scherer S.E., Caskey C.T. Science, 1987, vol. 237 (pg. 415-417)
10 Gilliland G., Perrin S., Blanchard K., Bunn H.F. Proc. Natl Acad. Sci. USA, 1990, vol. 87 (pg. 8725-8729)
11 Liang P., Pardee A.B. Science, 1992, vol. 257 (pg. 967-971)
12 Sive H.L., St John T. Nucleic Acids Res., 1988, vol. 16, pg. 10937
13 Duggan D.J., Bittner M., Chen Y., Meltzer P., Trent J.M. Nature Genet., 1999, vol. 21 suppl. (pg. 10-14)
14 Velculescu V.E., Zhang L., Vogelstein B., Kinzler K.W. Science, 1995, vol. 270 (pg. 484-487)
15 Scott H.S., Papasavvas P., Rossier C., Michaud J., Chrast R., Velculescu V.E., Dahoun S., Barras C., Buell G., Feger G., Antonarakis S.E. Am. J. Hum. Genet., 1998, vol. 63 suppl., pg. 113
16 Chenchik A., Zhu Y.Y., Diatchenko L., Li R., Hill J., Siebert P.D. In Siebert P., Larrick J. (eds), Gene Cloning and Analysis by RT-PCR, 1998, Natick, MA, Biotechniques Books (pg. 305-319)
17 Clark J.M. Nucleic Acids Res., 1988, vol. 16 (pg. 9677-9686)
18 Hu W.S., Temin H.M. Science, 1990, vol. 250 (pg. 1227-1233)
19 Kulpa D., Topping R., Telesnitsky A. EMBO J., 1997, vol. 16 (pg. 856-865)
20 Patel P.H., Preston B.D. Proc. Natl Acad. Sci. USA, 1994, vol. 91 (pg. 549-553)
21 Higuchi R., Fockler C., Dollinger G., Watson R. Biotechnology, 1993, vol. 11 (pg. 1026-1030)
22 Gubler U., Hoffman B.J. Gene, 1983, vol. 25 (pg. 263-269)
23 Welle S., Bhatt K., Thornton C.A. Genome Res., 1999, vol. 9 (pg. 506-513)
24 De Waard V., van den Berg B.M.M., Veken J., Schultz-Heienbrok R., Pannekoek H., van Zonneveld A.-J. Gene, 1999, vol. 226 (pg. 1-8)
25 Akowitz A., Manuelidis L. Gene, 1989, vol. 81 (pg. 295-306)
26 Maruyama K., Sugano S. Gene, 1994, vol. 138 (pg. 171-174)
27 Frohman M.A., Dush M.K., Martin G.R. Proc. Natl Acad. Sci. USA, 1988, vol. 85 (pg. 8998-9002)
28 Debouck C. Curr. Opin. Biotechnol., 1995, vol. 6 (pg. 597-599)
29 Datson N.A., van der Perk-de Jong J., van den Berg M.P., de Kloet E.R., Vreugdenhil E. Nucleic Acids Res., 1999, vol. 27 (pg. 1300-1307)
© 1999 Oxford University Press