Spontaneous mutation spectra in supF: comparative analysis of mammalian cell line base substitution spectra

Abstract

The last decade has seen a dramatic accumulation of mutation data from reporter genes utilized in mutagenesis experiments involving DNA reactive agents, allowing comparisons of the mutagenic potential of many different mutagens. When analysing chemically induced mutation spectra it is important to establish the potential spontaneous background before drawing conclusions concerning specific chemically induced hotspots. A major mutation reporter gene used in mammalian cells is the supF suppressor tRNA gene. The Mammalian Gene Mutation Database (MGMD) contains a considerable number of supF spontaneous mutations, permitting a thorough analysis of spontaneous mutations in mammalian cell lines from different species and tissues. Analyses of spontaneous mutation spectra were performed using a range of statistical techniques. Spontaneous mutations were observed at 82.4% of the nucleotides in the supF suppressor tRNA sequence, although the pattern of significant hotspots differed between cell lines. Our analyses of spontaneous mutation spectra show considerable variation both within and between cell lines in the distributions of spontaneous mutations, with no clear tissue- or species-specific patterns emerging. In addition, spectra derived from supF recovered from liver and skin of transgenic mice were similar to each other, but showed significant differences from many in vitro spectra. The most common base substitutions were G:C→T:A transversions and G:C→A:T transitions, although levels of each type differed between cell lines. There was also variation between cell lines for the most mutable dinucleotides; however, significant hotspots were frequently observed at CpG sites and sequences containing GG/CC.
We conclude that the number of varying distributions and potential hotspots for spontaneous mutations should thus be considered when comparing chemically induced mutation spectra in supF. The spectra presented here will be a useful reference for analysis and re-analysis of chemically induced spectra as well as for use in comparison with the spontaneous spectra of other gene systems.

Introduction

In order to gain insight into the underlying molecular events that lead to mutagenesis, both in vitro and in vivo model systems have been developed to detect mutation events. Much mutant data is available for many environmental mutagens, allowing the generation of mutational spectra highlighting the mutation hotspots and coldspots in a target gene. Analysis of the specificity of a particular mutagen allows assumptions to be made as to the types of genetic lesions that a compound induces. Such assumptions can then be extrapolated to 'cancer genes' and assessments made for carcinogenicity hazard and risk, although this area remains controversial (Vineis et al., 1999).

Although much attention has been given to the potential of exogenous agents to produce mutations, there has been little focus on the patterns of spontaneous events. The term 'spontaneous' has been used rather loosely, but has generally been used to describe mutations that have arisen in the absence of any specified treatment (Glickman et al., 1994). The fact that all types of mutation can arise spontaneously and have presumed roles in carcinogenesis (Loeb, 1989) and ageing (Ames and Gold, 1991) highlights the importance of spontaneous mutagenesis. Furthermore, when analysing the mutagenic effects of exogenous compounds it makes sense to have as much understanding as possible of the spontaneous 'background' that may contribute to the overall mutation spectrum.
It is thought a combination of factors gives rise to spontaneous mutations, including errors in DNA replication, recombination and repair (Friedberg et al., 1995). DNA damage can also result from the intrinsic instability of DNA, such as deamination, as well as endogenous products of metabolism, including oxidants, alkylating agents, reactive nitrogen species, lipid peroxidation products and other intermediates of various metabolic pathways (Smith, 1992; Burcham, 1999). It has also been established that variations in experimental procedures used in model systems can affect the yields and types of spontaneous mutations observed (Smith, 1992). The Escherichia coli gene supF encodes a tyrosine amber suppressor tRNA molecule which has been adapted for the study of mutagenesis in several shuttle-vector plasmids (reviewed by Kraemer and Seidman, 1989). supF was first described as a shuttle-vector mutagenesis marker in the plasmids p3AC (Sarkar et al., 1984) and pZ189 (Seidman et al., 1985). These plasmids and their derivatives have been used extensively for assessing mutagenesis in mammalian cells. The transient shuttle-vectors can be treated with a DNA-damaging agent before or after transfection into mammalian cells. Active supF allows for the incorporation of tyrosine at the amber chain termination codons (UAG). After isolation from mammalian cells the most common assay for identifying supF mutants involves introduction of the shuttle-vector into a bacterial indicator strain carrying an amber mutation in the lacZ gene, which codes for β-galactosidase (Kraemer and Seidman, 1989). In this screening system functional supF allows read-through of the stop codon, resulting in blue colonies, whereas colonies with inactivated or partially active supF are white or light blue (Kraemer and Seidman, 1989). One of the main advantages of supF in shuttle-vector plasmids is the gene's extreme sensitivity to mutagenic inactivation. 
The suppressor tRNA region is only 85 nt in length and a review by Canella and Seidman (2000) showed that base substitutions have been observed in all of these nucleotides. In addition, all six possible base substitution mutations can be monitored at all but 10 sites, allowing the detection of all three base substitutions that can occur at each base pair. Therefore, the selection bias associated with protein coding mutagenesis marker genes is largely avoided.

The Mammalian Gene Mutation Database (MGMD) contains ~16 000 entries for supF DNA sequence information for mutants that have arisen in mammalian cell lines, of which nearly 900 are spontaneous base substitutions. Mutation data in MGMD can be rapidly accessed (http://www.listnweb.swan.ac.uk/cmgt/index.htm) and retrieved for comparison and analysis (Lewis et al., 2000). supF spontaneous mutation data is available for DNA repair-deficient human cell lines and DNA repair-proficient human, monkey and mouse cell lines, as well as for transgenic mice. The accumulation of such data permits the generation of spontaneous mutation spectra and profiles for supF in individual mammalian cell types. The identification of differences in the distributions and frequencies of spontaneous mutations in supF transfected into different cell types is a prerequisite when analysing and comparing the mutation spectra of exogenous compounds obtained using the same system. Potential spontaneous mutation differences between cell types could indicate underlying variations in methylation, cell metabolism and DNA repair capacity. Most importantly, when sufficient data is available the generation of spontaneous mutation spectra yields an estimate of the 'background' that should be considered for each cell type before drawing conclusions from induced mutation spectra produced by mutagens and carcinogens.
This report presents a comprehensive survey and analysis of spontaneous mutations recorded for the supF gene in mammalian cell lines, providing a reference and baseline for the study and comparison of induced mutation spectra of exogenous compounds in this gene.

Materials and methods

The mammalian gene mutation database

The data presented in this paper were retrieved from the MGMD (Lewis et al., 2000). The database contains DNA sequence information on over 39 000 mutants from gene mutation assays involving the tyrosine suppressor tRNA (supF) gene, the hypoxanthine guanine phosphoribosyltransferase (hprt) gene, the adenine phosphoribosyltransferase (aprt) gene, the xanthine (guanine) phosphoribosyltransferase (gpt) gene and the dihydrofolate reductase (dhfr) gene. Mutants are classified as single base pair substitutions, tandem base pair substitutions (two adjacent base pair substitutions), multiple base pair substitutions (two non-adjacent base pair substitutions), deletions, insertions, duplications, rearrangements and complex (e.g. a base pair substitution and deletion in the same mutant) events. Currently, there are ~16 000 mutant entries for supF, of which over 1700 are spontaneous data. All the mutants analysed were assumed to be of independent origin. The supF gene consists of a 35 bp promoter, a 40 bp pre-tRNA region (having a function in RNA processing), an 85 bp tRNA sequence and a short 17 bp 3′-flanking sequence (Kraemer and Seidman, 1989).

Cell types

Spontaneous supF mutation data in MGMD were available for human (Ad293, GM606, GM0637, W138-VA13, XP12BE-SV, XAN1 and 2-0-A2), monkey (CV-1 and COS-7) and mouse (LN12) transformed cell lines (Table I). The data analysed also included supF base substitutions that had arisen in transgenic mouse strains.

Mutation spectra

The majority of the supF mutants in the literature are recorded for the non-transcribed (NT) strand.
Due to the structure of the supF gene and the alterations made to supF shuttle-vectors in the pre-tRNA region, only mutations that arose in the suppressor tRNA region (nt 99–183) have been analysed in this study. The numbering system for the supF NT strand used here is that of Seidman et al. (1985). Mutation data can be copied directly from MGMD to a spreadsheet for simple generation of mutation spectra.

Comparison of mutation spectra

The hypergeometric test was used to test for significant differences in the distribution of mutations between spectra following the method described by Adams and Skopek (1987). The algorithm is available in a free user-friendly computer program (Cariello et al., 1994). Using this test, the probability of the observed results for a table of M spectra and N mutable sites is calculated. Monte Carlo simulation is then used to generate 10 000 random tables, fixing the marginal totals for each to those observed in the experimental data. The proportion of random tables giving probabilities for the simulated results that are equal to or less than that for the observed results is taken as the probability of the observed results under the null hypothesis of no association in the observed table.

The identification of hotspots within individual spectra was based on the approach described by Tarone (1989). The expected number of mutations at a site is calculated as the total number of mutations in the spectrum divided by the number of sites, i.e. each mutation is expected to fall at a given site with probability 1/(number of sites). The binomial distribution is used to calculate the a priori probability of the observed or a greater number of mutations at a given site. Monte Carlo simulation is then used to generate 10 000 random spectra using the same expected probabilities. The proportion of these spectra having at least one site significant at this a priori level is then the a posteriori significance level for that site.
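The hotspot procedure described above can be illustrated with a minimal Python sketch. It is not the authors' FORTRAN program: the function names, the uniform null model (every site equally likely) and the example counts are assumptions made for illustration only.

```python
import math
import random


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def hotspot_significance(counts, n_sim=10_000, seed=0):
    """Return (a_priori, a_posteriori) p-values, one pair of entries per site.

    counts: number of mutations observed at each site of one spectrum.
    Assumes a uniform null model in which each mutation falls at any site
    with probability 1 / (number of sites).
    """
    n_sites = len(counts)
    total = sum(counts)
    p_site = 1.0 / n_sites

    # A priori: binomial probability of the observed or a greater count.
    tail = [binomial_tail(k, total, p_site) for k in range(total + 1)]
    a_priori = [tail[c] for c in counts]

    # Simulate random spectra and keep the smallest a priori p-value of each,
    # which belongs to its most heavily mutated site.
    rng = random.Random(seed)
    sim_min_p = []
    for _ in range(n_sim):
        sim = [0] * n_sites
        for _ in range(total):
            sim[rng.randrange(n_sites)] += 1
        sim_min_p.append(tail[max(sim)])

    # A posteriori: proportion of random spectra with at least one site
    # at least as significant as the observed site.
    a_posteriori = [sum(1 for m in sim_min_p if m <= p) / n_sim for p in a_priori]
    return a_priori, a_posteriori


# Hypothetical 10-site spectrum with one heavily mutated site.
prior, posterior = hotspot_significance([10, 1, 0, 2, 1, 0, 1, 0, 0, 1], n_sim=2000)
print(prior[0], posterior[0])
```

Because each simulated spectrum is summarised by its most extreme site, the a posteriori value answers the question "how often would random data alone produce a site this extreme anywhere in the spectrum?".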
Obviously, the a priori probability value is numerically lower than the a posteriori level for a site, so sites that are significant a priori are not always significant a posteriori. The a posteriori value takes account of the fact that even with random data lacking true hotspots some sites out of the many in a spectrum will generate significant a priori probability values just by chance. Thus use of the a posteriori values reduces the likelihood of a Type I, false positive error. FORTRAN programs for carrying out these tests are available on request from the authors.

Comparison of mutation types

The hypergeometric test was further used to compare base substitution frequencies between spectra.

Dinucleotide relative mutability

The likelihood of mutation within specific dinucleotides for each cell type was calculated as described in detail by Cooper and Krawczak (1990). Sixteen dinucleotides are possible in a DNA sequence and the frequency of each can be calculated for a given nucleotide length. For each dinucleotide the expected number of substitutions is equal to: dinucleotide frequency × total number of substitutions in all dinucleotides. One can then compare the observed versus the expected mutations for each dinucleotide. Furthermore, the mutability of each dinucleotide can be calculated relative to the least mutable dinucleotide (i.e. that presenting the least number of substitutions) by the equation:

\[\mathrm{drm}(d)\ =\ \frac{O(d)/E(d)}{O(d_{least})/E(d_{least})}\]

where drm(d) is the dinucleotide relative mutability, O(d) is the observed dinucleotide frequency, E(d) is the expected dinucleotide frequency and d_least is the least mutable dinucleotide. Dinucleotide relative mutabilities can then be ordered according to rank from the most mutable to the least mutable (Cooper and Krawczak, 1990), for example CG > TC > GG > ... AA > TA.
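As a rough illustration of the dinucleotide calculation just described, the following Python sketch computes expected substitutions from dinucleotide frequencies and ranks dinucleotides by mutability relative to a reference. The function names, the toy sequence and the observed counts are hypothetical. The sketch also assumes that the reference is the dinucleotide with the smallest non-zero observed/expected ratio, a choice made here only to avoid division by zero.

```python
from collections import Counter


def dinucleotide_frequencies(seq):
    """Frequency of each overlapping dinucleotide in seq."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = Counter(pairs)
    return {d: c / len(pairs) for d, c in counts.items()}


def relative_mutability(seq, observed):
    """Rank dinucleotides by observed/expected ratio relative to a reference.

    observed: dict mapping dinucleotide -> number of substitutions observed in it.
    Returns a list of (dinucleotide, relative mutability), most mutable first.
    """
    freqs = dinucleotide_frequencies(seq)
    unknown = set(observed) - set(freqs)
    if unknown:
        raise ValueError(f"dinucleotides not present in sequence: {sorted(unknown)}")

    total = sum(observed.values())
    # Expected substitutions = dinucleotide frequency x total substitutions.
    ratios = {d: observed.get(d, 0) / (f * total) for d, f in freqs.items()}

    # Assumption: use the smallest non-zero ratio as the reference.
    reference = min(r for r in ratios.values() if r > 0)
    drm = {d: r / reference for d, r in ratios.items()}
    return sorted(drm.items(), key=lambda item: item[1], reverse=True)


# Toy example: invented sequence and counts, not supF data.
ranking = relative_mutability("ACGTCGA", {"CG": 8, "TC": 2, "GA": 1, "AC": 1})
for dinucleotide, value in ranking:
    print(dinucleotide, round(value, 2))
```

In this toy case CG occurs twice among six dinucleotides, so four of the twelve substitutions are expected there. Eight are observed, which makes CG four times as mutable as the reference dinucleotides GA and AC.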
Strand bias for base substitutions

The statistical significance of differences between observed and expected base substitutions in the transcribed and NT strand was determined by χ2 analysis where:

\[\chi^{2}\ =\ \sum \frac{(\mathrm{obs} - \mathrm{exp})^{2}}{\mathrm{exp}}\]

(with 1 degree of freedom).

Results

Comparison of mutation spectra

A total of 837 supF tRNA sequence spontaneous base substitutions were found in MGMD that were derived from 20 individual publications (Table I). Initially, the mutation data from all publications were pooled to obtain a total mutation spectrum and to provide an overall insight into the spontaneously mutable regions of supF after transfection into mammalian cells (Figure 1A). Spontaneous mutations can be seen to be non-randomly distributed along the sequence, the majority of substitutions occurring at cytosines or guanines. The most mutable regions involved guanine tracts (G102–G105 and G122–G124), cytosine tracts (C108–C110) and CpG sites, particularly when found within a TCGA sequence (148–151, 154–157 and 162–165), where the level of mutations at the cytosine was increased relative to other CpG sites. Twenty-three nucleotides were significant mutation hotspots (at the 5% significance level): G102, G103, G104, C108, C109, G111, G115, G122, G123, G129, C133, C139, C146, C149, G150, C155, G156, G159, G160, C163, G164, C168 and C169. Sites showing no detectable mutations were A112, A119, A121, A128, T132, T145, A147, T148, A158, T170, T171, C178, C181 and A183. Table II shows that the predominant type of spontaneous substitution was G:C→A:T (47.9%), followed by G:C→T:A (31.2%) and G:C→C:G (14.9%), although the overall number of transitions and transversions was similar (49.2 and 50.8%, respectively). The most frequent mutation hotspot, C133, falls within the supF anticodon sequence where C→G and C→A transversions change the sequence to code for tyrosine.
Surprisingly, C→T transitions were also observed, even though this mutation still results in a stop codon, and all were single substitutions. The second most frequent hotspot, G123, was also the major hotspot in a spectrum (Figure 1B) derived for substitutions observed after single-stranded (transcribed strand) supF was transfected into the monkey kidney COS7 cell line (Cabral Neto et al., 1992, 1993; Madzak et al., 1992; Fronza et al., 1994). Virtually all substitutions observed in single-stranded supF were at cytosines. Three other major spontaneous hotspots, G156, G159 and G164, were also observed to be highly mutable in the single-stranded spectrum. Other mutable regions in the overall spontaneous spectrum were not detected in the single-stranded spectrum. In order to establish the heterogeneity of data derived from different sources and cell lines, the contribution of each publication to the total number of substitutions at each site was investigated (Figure 1C). There is variation between mutated nucleotides as to the number of individual spectra contributing to the level of mutations. For example, 16 individual spectra showed a substitution at G123 (one of the significant hotspots), whereas only one individual spectrum had a substitution at T107. In fact, there was strong positive correlation between the number of mutations per site and the number of individual spectra showing a mutation (r = 0.943, P < 0.0001). This resulted in a very similar (P = 0.991) distribution pattern for the spectra shown in Figure 1A and C. This could be expected given the different probability of mutation at each site and the variation in numbers of mutations in individual spectra (Table I). However, it was unknown whether the pooled mutation distribution was representative of all individual spectra or was comprised of distributions specific to cell lines, tissues or species. 
If the pooled spectrum was homogeneous then it should represent the true spontaneous supF mutational spectrum. Therefore, each individual spectrum was constructed for comparison (Figure 2). First, the spectra were compared by eye for similarities and differences. It was immediately apparent that some mutable sites were peculiar to a number of spectra, but not necessarily all. These sites tended to fall in the mutable regions described above for the overall spontaneous spectrum. However, many spectral comparisons appeared to reveal different mutation patterns and, site-for-site, no two spectra shared exactly the same distribution. Application of the Adams–Skopek test, for comparing the mutation distributions in two mutation spectra, confirmed that many pair-wise spectra comparisons (Table III) were significantly different (100/190). Table III reveals that no two spectra are the same for the pattern of how they differ from the other spectra. For example, although HF3 and HF4 have a statistically similar distribution (P = 0.548), HF3 and MK1 were significantly different, whereas HF4 and MK1 were not. The rank order of spectra showing the greatest number of significant differences to the least (when compared with all other spectra) was: (the numbers above each cell line indicate the number of times that cell line spectrum was significantly different in a pair-wise comparison). Spectra with a low number of mutations generally showed few significant differences with other spectra, however, there was no correlation between the number of mutations per spectrum and the number of significant pair-wise comparisons per spectrum (r = 0.158, P = 0.082). MK5 (COS-7) was significantly different to all other spectra, whereas the other COS-7 spectrum (MK4) was significantly different to only five others. Interestingly, there was no true underlying pattern that could be established for spectra either within or between different cell lines. 
Comparisons between human kidney cell line Ad293 (HK1–HK4) were all non-significant, but three of these spectra contained 13 or less mutations. Indeed, the probability value for the comparison between HK1 and HK2 is equal to 1.000, yet neither spectrum shares a common mutated site. This problem partially arises because all the mutated sites in both spectra have just one mutation and this is unavoidable when comparing spectra containing only a few mutations. The xeroderma pigmentosum (XP) derived fibroblast cell line spectra (HF3–HF5) also showed non-significant results when compared with each other. All other `within-cell line' comparisons showed at least one pair of spectra to be significantly different. The spectra derived from supF rescued from the liver and skin of transgenic mice were also very different from the remaining in vitro derived spectra, although both were highly similar to each other (P = 0.938). Importantly, there seemed to be no clear relationship between mutation distribution and the plasmid carrying the supF gene or the bacterial host used for mutant selection. In an effort to try and establish the basis for the lack of an underlying cell line pattern and why some spectra were so different to the rest (such as MK5 and HL1), the significant hotspots (P < 0.05) were determined for each individual spectrum (Table IV). This would allow the determination of potential randomly mutated sites from sites that could be considered true spontaneous hotspots. At the a priori probability level 27 hotspots (71.1%) were shared by at least two spectra and 11 hotspots (28.9%) were unique to individual spectra, although these were not necessarily rare mutation sites in other spectra. In addition, 12 hotspots that were significant in individual spectra were not significant after the data was pooled. Hotspot sites in all spectra were also mutable sites in at least one other spectrum. 
The MK5 spectrum contained one unique hotspot (G160) that comprised of 48.4% of the MK5 substitutions. HL1 was similar to only two other spectra, HF5 and HF6. HL1 and HF5 both had in common C133 as the most mutable hotspot, with C108 also being a major hotspot in both cases. HL1 and HF6, on the other hand, shared four significant hotspots that amounted to 46.5 and 46.9% of the mutations, respectively, for each spectrum. Two of the three CV-1 cell line spectra, MK2 and MK3, had a similar hotspot pattern. All eight MK3 hotspots were also hotspots for MK2 (which contained nearly four times as many mutations) including three, G103, G104 and G111, that were unique to that cell line, the latter being the most prominent hotspot for these two spectra but rarely mutated in others. The two transgenic mouse spectra MsS and MsL also differed considerably from most other spectra and showed some interesting patterns unique to these spectra. The CpG nucleotides at C114 and G115 were both significant hotspots in each spectrum, whereas they were only mutated at a low level, if at all, in all the other spectra. The most common hotspot, C133, found in 13 spectra and a significant hotspot in eight spectra had no mutations present in the transgenic mouse MsS and MsL spectra. The likelihood of a site erroneously being considered a true mutational hotspot is reduced by considering the number of significant hotspots at the a posteriori probability level. Table IV also shows a considerable reduction in the number of sites significant within each cell line at the a posteriori level. After this reduction only three hotspots, G123, C155 and G156, occur over all species, albeit not in all the cell lines/tissues of all species. Other groups of hotspots, G108 (also in MsF), G109, G129, C133 and C139, are apparent for certain human cell lines but not for the monkey cell lines. One significant hotspot (G159) was shared between monkey and mouse. 
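The pair-wise spectrum comparisons reported in this section rely on the Monte Carlo form of the hypergeometric test outlined in Materials and methods. A minimal Python sketch of that idea follows. It is not the Cariello et al. (1994) program: the function names and example spectra are hypothetical, and random tables with fixed marginal totals are generated here by shuffling the pooled mutations between spectra.

```python
import math
import random


def log_table_kernel(table):
    """Log of the table probability up to a constant fixed by the marginal totals."""
    return -sum(math.lgamma(cell + 1) for row in table for cell in row)


def compare_spectra(spectra, n_sim=10_000, seed=0):
    """Monte Carlo p-value for the null hypothesis that spectra share one distribution.

    spectra: list of equal-length lists of mutation counts per site.
    """
    n_sites = len(spectra[0])
    row_totals = [sum(s) for s in spectra]
    # One site label per mutation, pooled over all spectra.
    pool = [site for s in spectra for site, c in enumerate(s) for _ in range(c)]
    observed = log_table_kernel(spectra)

    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        rng.shuffle(pool)  # reassign mutations to spectra, keeping both margins
        start = 0
        simulated = []
        for total in row_totals:
            row = [0] * n_sites
            for site in pool[start:start + total]:
                row[site] += 1
            simulated.append(row)
            start += total
        # Count random tables that are no more probable than the observed one.
        if log_table_kernel(simulated) <= observed + 1e-9:
            hits += 1
    return hits / n_sim


# Hypothetical spectra over four sites.
print(compare_spectra([[5, 5, 5, 5], [5, 5, 5, 5]], n_sim=2000))    # similar
print(compare_spectra([[20, 0, 0, 0], [0, 0, 0, 20]], n_sim=2000))  # very different
```

Because the marginal totals are fixed, the probability of a table depends only on the product of the factorials of its cells, which is why the sketch compares that term alone.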
Comparisons of mutation types

A more consistent pattern emerged for the types of substitution occurring within each spectrum (Table II). G:C→A:T transitions were the most frequent mutation type in 70% (14/20) of spectra. Nine of these spectra (HK3, HL1, HF1, HF3, MK2, MK3, MK5, MsF and MsS) had a significantly higher (P < 0.05) level of G:C→A:T substitutions relative to other types of change. Conversely, G:C→T:A was predominant in HK3, HF2, HF6, MK5 and MsL (where HK3 and MK5 had significantly higher G:C→T:A relative to other substitution types), whereas G:C→C:G was most frequent in HK2. MK1 had the same percentage (23.1%) of G:C→A:T, G:C→T:A and A:T→T:A substitutions, but these were made up of only 13 mutations, suggesting that these frequencies may have occurred by chance. The majority of G:C→T:A transversions were distributed over GG/CC, TC/GA and CG dinucleotides and were not especially common to any particular hotspots. The frequencies of transitions and transversions for all spectra were compared pair-wise using the Adams–Skopek test (Table V). Predictably, in general those spectra with G:C→T:A and G:C→C:G mutations predominating were the most different compared with the remaining spectra. However, for those spectra where G:C→A:T transitions were most common an underlying pattern was apparent. For pair-wise comparisons between human cell line spectra only 7.1% (2/28) were significantly different. However, human and monkey pair-wise comparisons had a much higher number of differences, 43.8% (14/32), whereas 66.7% (4/6) of all within-monkey comparisons were significantly different. Also, 71.4% (5/7) of monkey and mouse comparisons and 33.3% (8/24) of human and mouse comparisons were significantly different. In mice there were no significant differences between any of the pair-wise comparisons. Importantly, these findings show that the levels of all kinds of substitutions present in human cell line spectra are homogeneous when G:C→A:T substitutions predominate.
The same is true for transgenic mouse spectra, but this is not the case for monkey spectra, which showed greater variation.

Figure 1A reveals that GG/CC and GA/CT are the most frequently mutated dinucleotides, closely followed by CpG sites. Mutations at CpG sites make up ~25% of the total number. However, CpG dinucleotides collectively make up 43% of the total sequence. Therefore, nearest neighbour analysis (Cooper and Krawczak, 1990) was performed on pooled and individual spectra to identify any underlying pattern as to the sequences where spontaneous mutations occur in supF. This approach takes into account the frequencies of all 16 possible dinucleotides within the sequence as well as the numbers of substitutions that occur and allows one to build rank orders of the most mutable compared with the least mutable dinucleotides for both individual and pooled spectra. The pooled and individual dinucleotide rank orders (Table VI) reveal variation between spectra as to where substitutions commonly occur. The most mutable dinucleotides in human cell lines were GA/TC (50.0%), GG/CC (33.3%), CpG (8.3%) and CT/AG (8.3%), in monkey cell lines they were CpG (60.0%), TA/AT (20.0%) and GT/AC (20.0%, due to the unusual G160 hotspot in MK5) and in mouse were CpG (66.7%) and TG/CA (33.3%). These dinucleotides (with the exception of TA) tend to always be found high up the rankings in most individual cases and explain the pattern observed in the pooled rank order. The least mutable dinucleotides over all spectra involved adjacent adenines and thymines, particularly AT, TA and AA/TT. The mutability of cytosine and guanine in CpG dinucleotides differed within the spectra. The level of cytosine and guanine mutations at all six CpG sites ranged between 3 and 39% over all spectra, with the majority of these mutations falling within the three TCpGA sites.
The cytosines and guanines in GA/TC sites tended to be more spread over the sequence, probably reflecting the higher frequency and distribution of this dinucleotide pairing. The level of cytosine and guanine mutations in all three TCpGA sequences, given the general high mutability of GA/TC and CpG sites, suggests that these dinucleotides act synergistically, increasing the probability of mutation. It is also noteworthy that the third most mutable CpG site (C110–G111), which is also highly mutable in MK2 and MK3, lies within a CGA/TCG sequence. GG/CC substitutions were generally observed in guanine tracts (G102–G105 and G122–G124) and a cytosine tract (C108–C110). Surprisingly, very few GG/CC mutations were found in the GG/CC-rich region between nt 168 and 182.

Strand bias

Analysis of the individual spectra in Figure 2 reveals different levels of mutations at cytosines and guanines along the NT strand. For each individual spectrum cytosine and guanine frequencies were compared for strand bias (Table VII). Eight spectra show no strand bias, i.e. similar numbers of mutable cytosines and guanines to that expected. Four of these (HK2, HK3, MK1 and MK4) have fewer than 11 mutable cytosines and guanines, suggesting that the non-significant result could be due to chance. The other four spectra, HF1 and the three XP derived spectra HF3, HF4 and HF5, all have a higher number of mutated cytosines and guanines yet show no strand bias, indicating that within these cell lines there could be no preferential repair of either strand. The remaining 12 spectra all show a significant strand bias, of which 10 show a significant excess of substitutions at guanines. The other two spectra, HL1 and HF6, have an excess of substitutions at cytosines in the NT strand and show an exceptionally high frequency of substitutions at TC dinucleotides (see Table VI).
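The strand bias comparison can be sketched as a χ2 test with 1 degree of freedom. The Python example below is illustrative only: the function name and the counts are hypothetical, and it assumes that the expected numbers of substitutions at cytosines and guanines are proportional to the numbers of cytosine and guanine sites in the NT strand, which the text does not state explicitly.

```python
import math


def strand_bias_chi2(obs_c, obs_g, n_c_sites, n_g_sites):
    """Chi-square test (1 d.f.) for an excess of substitutions at C or at G.

    obs_c, obs_g: substitutions observed at cytosines and guanines (NT strand).
    n_c_sites, n_g_sites: numbers of cytosine and guanine sites in the sequence.
    Assumption: expected counts are proportional to the number of sites.
    """
    total = obs_c + obs_g
    exp_c = total * n_c_sites / (n_c_sites + n_g_sites)
    exp_g = total - exp_c
    chi2 = (obs_c - exp_c) ** 2 / exp_c + (obs_g - exp_g) ** 2 / exp_g
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical spectrum: 10 substitutions at cytosines, 30 at guanines,
# with equal numbers of C and G sites.
chi2, p = strand_bias_chi2(10, 30, 25, 25)
print(f"chi2 = {chi2:.2f}, P = {p:.4f}")
```

With equal numbers of sites the expected counts are 20 and 20, so the statistic is (10 − 20)²/20 + (30 − 20)²/20 = 10, which indicates a significant excess at guanines at the 5% level.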
Discussion

The last 15 years have produced a multitude of publications describing mutations that have arisen in the supF tRNA sequence after treatment with many different mutagens and subsequent replication in mammalian cells. Many of these publications also describe mutations, mainly base substitutions, assumed to have arisen spontaneously in control experiments. The extensive collection of these mutations in the MGMD (Lewis et al., 2000) has allowed, for the first time, a survey of the distribution and types of spontaneous base substitutions found in supF when cultured in mammalian cell lines. One of the principal reasons for carrying out this present study was to identify potential differences in the patterns of spontaneous mutations in cell lines derived from different tissues and species. Identification of tissue-specific differences for spontaneous mutations would be of critical value when assessing the pattern of chemically induced mutations in different cell types. It would also aid in deciding whether pooling of data derived from different sources is legitimate.

Previous studies have shown tissue-specific differences for the distribution and types of spontaneous mutation occurring in transgenic mouse. Harbach et al. (1999) found no differences between spontaneous spectra and hotspots for the cII locus in liver, lung and spleen, but G:C→A:T transitions were more abundant in liver and lung, whereas spleen showed a higher level of G:C→T:A. Ono et al. (1999) found G:C→A:T to be the most frequent substitution in the lacZ gene recovered from mouse liver, spleen and brain. The same results were obtained by de Boer et al. (1996) for the lacI gene in kidney, stomach and liver. A more extensive study by de Boer et al. (1998) of spontaneous substitutions of lacI in transgenic mice revealed G:C→A:T to be most frequent in lung, spleen, bladder, skin, stomach, kidney and liver.
However, they also found that G:C→A:T occurred at equal frequency to G:C→T:A in bone marrow. Douglas et al. (1994) also showed G:C→A:T to be the most abundant substitution type in the lacZ gene in mouse liver. Taking all data from this present study into account, however, there were no apparent tissue-specific patterns or differences for spontaneous supF base substitutions in cultured mammalian cells. The overall conclusion to be drawn from this set of mutation data is that there is considerable variation between cell lines in the actual spontaneous mutation distribution that can arise within the 85 bp supF tRNA sequence. To some extent this was enhanced by the variation in numbers of mutations contributing to each spectrum. When comparing the mutation distribution alone there was no homogeneity within cell lines, with many different hotspots apparent. The transgenic mouse liver and skin distributions were both highly different to the cell line spectra (Table III), suggesting that the mechanisms of spontaneous mutagenesis, including cause and/or processing of certain adducts, could be different to that found in cultured mammalian cells, at least for base substitutions. However, mutable regions did stand out over nearly all spectra, particularly those involving GG/CC and TCpGA, showing the higher spontaneous mutability of these sequences. Although there was much variation in the different supF mutation distributions, there were evident patterns in the types of substitutions occurring in the cell lines and the sequences being targeted. G:C→A:T transitions predominated in over half the individual spectra, followed by G:C→T:A transversions, which were also the most frequent in the remaining spectra (except HK2). Akasaka et al. (1992) also found G:C→T:A to be the most frequent spontaneous substitution in E.coli transformed with supF. 
Also, a pattern existed as to the sequences where spontaneous mutations arose in cell lines originating from different species, with GA/TC and GG/CC being more mutable in human than monkey and mouse cells, where CpG sites were most mutable. There was also a pattern in strand bias for guanine mutations in the NT strand of 12 of the spectra, but this was not tissue specific. However, the fact that two spectra showed a NT strand bias for cytosine mutations and eight spectra showed no strand bias for mutations at either nucleotide suggests that differences can exist both within and between cell lines as to the causes, processing and repair of adducts that result in spontaneous mutations. Human tRNA genes are transcribed by RNA polymerase III and previous research shows that these genes are not associated with transcription-coupled repair (TCR), the primary source of a strand bias for mutations (Dammann and Pfeifer, 1997). In addition, the supF gene is not transcribed in mammalian cells, therefore, a surplus of mutations at guanines or cytosines in one strand observed in the supF data could result from different types or levels of endogenous factors. Endogenous factors that ultimately cause spontaneous mutations can be many and combinations of these factors in different cell lines in different laboratories could also lead to the variation observed for the supF spectra. Endogenous factors with direct or indirect DNA-damaging potential include oxidants, lipid peroxidation products, alkylating agents, oestrogens, chlorinating agents, reactive nitrogen species and certain intermediates of various metabolic pathways (Burcham, 1999). Added to this list could be chemical or physical mutagens to which the supF plasmid could be exposed within a laboratory. Oxidants probably have the greatest potential for causing spontaneous mutations (reviewed by Burcham, 1999) and could also cause variation in where and what types of base substitutions occur along the mutation spectrum. 
For example, products of cytosine oxidation are likely precursors of G:C→A:T transitions, which are abundant in the supF spectra described here. This would also explain the high number of spontaneous mutations at cytosines observed when single-stranded supF was transfected into a monkey cell line (Cabral Neto et al., 1992, 1993; Madzak et al., 1992; Fronza et al., 1994). Oxidation of guanine can result in G:C→T:A transversions, which were also abundant in the supF spectra. DNA damage by oxidants could be further enhanced by a compromised level of antioxidants within a cell line. Spontaneous deamination of cytosine to uracil would explain the existence of hotspots at CpG sites. It is also interesting that deamination of cytosine to uracil is 250-fold more frequent in single-stranded than double-stranded DNA (Madzak et al., 1992). The nature of the sequence of the supF tRNA gene suggests that there is great potential for hairpin loops to form, leaving certain nucleotides in single-stranded regions where deamination of cytosine would lead to G:C→A:T transitions. In addition, the efficiency of trans-lesion synthesis and 3′→5′ exonucleolytic proofreading of endogenous lesions depends upon the specific DNA polymerase that processes groups of lesions (Eckert and Opresko, 1999). Thus, spontaneous mutation spectra will ultimately vary given the extent and interplay of the different factors that cause spontaneous mutations. Smith (1992), in his review of spontaneous mutagenesis, stated that 'it is not widely appreciated that the types and frequencies of spontaneous mutations change markedly with subtle changes in experimental conditions'. There is always a small possibility that mutations could arise spontaneously in supF in the bacterial host used for selecting mutants. However, Hauser et al. 
(1986) showed, through a series of duplicate experiments involving transformation of E.coli MBM7070 with pZ189 vector after recovery from mammalian cells, that the mutations all arose in mammalian cells and not the bacteria. However, we observed that the spectrum for MK5 (monkey COS7 cells), which was highly significantly different from all other spectra in this study, was similar to the spontaneous spectrum for supF after transformation of E.coli KS40 cells (Akasaka et al., 1992) without prior transfection of mammalian cells. This was particularly the case for the hotspot at G160. In fact, comparison of the two spectra by the Adams–Skopek test revealed a P value of 0.03, which although significant was a much higher value than those observed in all other comparisons involving MK5. It is interesting that both studies utilized the KS40 strain (a derivative of MBM7070) and might suggest that the mutations at G160 of MK5 arose within the bacteria. This supports the argument against pooling spontaneous mutation data. Whether or not it is legitimate to pool any of the spontaneous mutation data is still debatable. Pooling of within-tissue or within-species data would result in spectra biased by the individual spectrum contributing most of the mutations. The results of spectra comparisons in this study using the Adams–Skopek test argue against pooling spectra. A useful additional approach to analysing mutation spectra would be a system where spectra comparisons result in a similarity score. For example, two identical spectra would show a score of, say, 1, whereas two unrelated spectra would have a score near to 0. This would allow groups of spectra to be analysed in a multi-dimensional manner revealing the potential inter-relationships between spectra within and between tissues and species. An alternative approach recently described involves collapsing similar profiles into homogeneous clusters to reveal their similarity patterns (Khromov-Borisov et al., 1999). 
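The similarity score suggested above can be illustrated with a minimal sketch. Cosine similarity is used here purely as an assumption; it is one of several possible measures and is not a method used in this study. The function name `spectrum_similarity` and the per-site counts are hypothetical. Identical distributions score 1 and spectra sharing no mutated sites score 0.

```python
import math

def spectrum_similarity(a, b):
    """Cosine similarity between two spectra given as {site: mutation count}."""
    sites = set(a) | set(b)
    dot = sum(a.get(s, 0) * b.get(s, 0) for s in sites)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("each spectrum needs at least one mutation")
    return dot / (norm_a * norm_b)

# Hypothetical counts, for illustration only (not data from the MGMD).
spec_x = {"G123": 4, "C133": 6, "G160": 1}
spec_y = {"G123": 8, "C133": 12, "G160": 2}  # same shape, twice the mutants
spec_z = {"A112": 5, "C139": 3}              # no sites shared with spec_x

score_same_shape = spectrum_similarity(spec_x, spec_y)
score_disjoint = spectrum_similarity(spec_x, spec_z)
```

A score of this kind ignores the number of mutations in each spectrum, so it would complement rather than replace a significance test such as the Adams–Skopek test, particularly for the smaller spectra.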
The spontaneous spectra described here can be used as a major reference source in a number of ways. Primarily, the individual spectra should be used for comparison of supF spectra generated after treatment with chemical and physical agents. This will assist in distinguishing mutagen-prone hotspots from potential spontaneous background mutations. A clear example involves the comparison of the spontaneous spectra shown here and the supF mutation spectra generated after treatment with the four stereoisomers of benzo[c]phenanthrene 3,4-dihydrodiol 1,2-epoxide (B[c]PhDE) (Bigger et al., 2000). Analysis of the four chemically induced mutation spectra shows C133 to be a major hotspot, yet this is also the most frequent spontaneously mutated site. Further analysis, however, reveals that after B[c]PhDE treatment the predominant type of substitution at C133 is G:C→T:A, whereas G:C→C:G is the most frequent spontaneous mutation at this site. Conversely, A112 is a major hotspot for three of the stereoisomers, yet no spontaneous mutation has been observed at this site, showing the specificity of the stereoisomers for this particular sequence. Mutagen-induced supF spectra can thus now readily be compared with a panel of spontaneous supF mutation spectra for hotspot assessment and deduction of the underlying potential background. We recommend that mutagen-induced mutation spectra should be compared with spontaneous (background) spectra of the same cell line but also that consideration is given to all of the spontaneous mutation spectra presented here, given the variability in distribution of supF spontaneous mutations within and between the different cell lines and the lack of any clear cell line/tissue/species mutation patterns. 
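Comparing the type of substitution at a shared hotspot such as C133 requires mapping a single-strand change onto one of the six base-pair classes used in this paper. The following is a minimal sketch of that mapping; the helper name `substitution_class` is hypothetical and the code is illustrative rather than part of the original analysis.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
PURINES = {"A", "G"}

def substitution_class(ref, alt):
    """Map a single-strand change (ref -> alt) to (base-pair class, kind)."""
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError("ref and alt must be two different bases from A, C, G, T")
    # Purine<->purine or pyrimidine<->pyrimidine is a transition.
    same_ring = (ref in PURINES) == (alt in PURINES)
    kind = "transition" if same_ring else "transversion"
    # Report the change from the purine of the original pair (G:C or A:T).
    if ref not in PURINES:
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    label = f"{ref}:{COMPLEMENT[ref]}→{alt}:{COMPLEMENT[alt]}"
    return label, kind

# Changes at a cytosine, such as C133 in the NT strand:
c_to_a = substitution_class("C", "A")
c_to_g = substitution_class("C", "G")
c_to_t = substitution_class("C", "T")
```

Under this mapping a C→A change at C133 is scored as G:C→T:A and a C→G change as G:C→C:G, the two classes contrasted above.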
In addition to comparisons with chemically induced mutation spectra, the extent of the spontaneous mutation data described here will also be a useful resource for determination of the various causes and sequence specificities of spontaneous mutations, especially in mammalian cell lines. In particular, comparisons of the spontaneous mutation spectra with those arising after treatment with oxidizing agents would give insight into the contribution of oxidative damage to the overall spontaneous mutation spectrum. Finally, the spontaneous supF mutation data may be compared with similar data from other genes widely used in in vitro mutagenesis studies, such as hprt, obtainable from The MGMD (Lewis et al., 2000), as well as mutations known to cause inherited human disease (Krawczak and Cooper, 1997).

Table I. supF spontaneous mutant data derived from 20 sources available in the MGMD (Lewis et al., 2000)
Data were first sorted by cell line and then separated into within cell line subgroups. For clarity and ease of analysis each cell line subgroup was coded, where the first letter describes the species and the second letter describes the tissue of origin. All cell lines are assumed to be nucleotide excision repair (NER) proficient with the exception of XP12BE-SV (HF3) and 2-0-A2 (HF5), expressing the XP-A phenotype, rendering them extremely sensitive to UV and chemical DNA-damaging agents. H, human; M, monkey; Ms, mouse; kid, kidney; lym, lymphoblast; fib, fibroblast.
^a Mutation frequency recorded as a percentage by authors.
Cell line  Code  Species  Origin  Plasmid  No. of base substitutions  Mutation frequency (×10^-4)  Reference
Ad293  HK1  H  kid  pS189  5  0.5  Bigger et al. (1990)
Ad293  HK2  H  kid  pS189  9  0.2  Bigger et al. (1992)
Ad293  HK3  H  kid  pS189  13  0.8  Boldt et al. (1991)
Ad293  HK4  H  kid  pSP189  50  1.1  Jeudes and Wogan (1996)
GM606  HL1  H  lym  pZ189  108  1.1  Jaberaboansari et al. (1991)
GM606  HL2  H  lym  pZ189  44  1.0  Sikpi et al. (1991)
GM0637  HF1  H  fib  pSP189  31  0.1–0.3%^a  Myrand et al. (1996)
GM0637B  HF2  H  fib  pZ189  37  0.07%^a  Seidman et al. (1987)
XP12BE-SV  HF3  H  fib  pSP189  30  0.1–0.3%^a  Myrand et al. (1996)
XAN1  HF4  H  fib  pSP189  22  0.1–0.3%^a  Myrand et al. (1996)
2-0-A2  HF5  H  fib  pSP189  19  0.1–0.3%^a  Myrand et al. (1996)
W138-VA13  HF6  H  fib  pMY189  32  N/A  Kawanishi et al. (1998)
CV-1  MK1  M  kid  pZ189  13  11.1  Akman et al. (1991)
CV-1  MK2  M  kid  pZ189  211  0.03%^a  Moraes et al. (1989)
CV-1 (TC-7)  MK3  M  kid  pZ189  56  3.7  Hauser et al. (1987)
COS-7  MK4  M  kid  p3AC  9  1.4  Yang et al. (1987)
COS-7  MK5  M  kid  pMY189  31  8.2  Murata-kamiya et al. (1997)
LN12  MsF  Ms  fib  λsupF  32  3.2  Leach et al. (1996)
Transgenic  MsS  Ms  skin  λsupF  42  0.2  Leach et al. (1996)
Transgenic  MsL  Ms  liver  λsupF  46  0.2  Leach et al. (1996)

Table II. Types of spontaneous base substitutions in supF observed for each cell line and subgroup
The initial figures are the percentage of that type of substitution observed for that cell line/subgroup, whereas the following figure in parentheses is the actual number of substitutions observed from that experiment. Figures in bold show the most frequent type of base substitution observed.
Columns: cell line/subgroup, then transitions (GC→AT, AT→GC), then transversions (GC→CG, GC→TA, AT→CG, AT→TA).
HK1  80.0 (4)  0 (0)  0 (0)  0 (0)  0 (0)  20.0 (1)
HK2  0 (0)  0 (0)  44.4 (4)  33.3 (3)  0 (0)  22.2 (2)
HK3  7.7 (1)  0 (0)  7.7 (1)  69.2 (9)  15.4 (2)  0 (0)
HK4  46.0 (23)  0 (0)  14.0 (7)  34.0 (17)  2.0 (1)  4.0 (2)
HL1  61.1 (66)  0 (0)  14.8 (16)  23.1 (25)  0.9 (1)  0 (0)
HL2  50.0 (22)  0 (0)  15.9 (7)  31.8 (14)  0 (0)  2.3 (1)
HF1  67.7 (21)  0 (0)  6.5 (2)  19.4 (6)  3.2 (1)  3.2 (1)
HF2  27.0 (10)  2.7 (1)  24.3 (9)  43.2 (16)  0 (0)  2.7 (1)
HF3  56.7 (17)  0 (0)  10.0 (3)  23.3 (7)  10.0 (3)  0 (0)
HF4  50.0 (11)  0 (0)  13.6 (3)  36.4 (8)  0 (0)  0 (0)
HF5  47.4 (9)  0 (0)  31.6 (6)  15.8 (3)  5.3 (1)  0 (0)
HF6  28.1 (9)  3.1 (1)  21.9 (7)  46.9 (15)  0 (0)  0 (0)
MK1  23.1 (3)  0 (0)  15.4 (2)  23.1 (3)  15.4 (2)  23.1 (3)
MK2  48.8 (103)  0.5 (1)  19.0 (40)  30.3 (64)  0.5 (1)  0.9 (2)
MK3  64.3 (36)  0 (0)  3.6 (2)  30.4 (17)  0 (0)  1.8 (1)
MK4  44.4 (4)  11.1 (1)  11.1 (1)  11.1 (1)  22.2 (2)  0 (0)
MK5  3.2 (1)  0 (0)  32.3 (10)  64.5 (20)  0 (0)  0 (0)
MsF  65.6 (21)  6.3 (2)  3.1 (1)  21.9 (7)  3.1 (1)  0 (0)
MsS  66.7 (28)  4.8 (2)  4.8 (2)  14.3 (6)  7.1 (3)  2.4 (1)
MsL  37.0 (17)  6.5 (3)  2.2 (1)  39.1 (18)  10.9 (5)  4.3 (2)
Total  47.9 (401)  1.3 (11)  14.9 (125)  31.2 (261)  2.6 (22)  2.0 (17)

Table III. The probabilities of significant differences for each pair-wise comparison of all spontaneous mutation spectra after application of the Adams–Skopek test for comparing the mutation distribution in two mutation spectra
Each value is the probability that the two spectra are different. Figures in bold highlight those spectra that were significantly different at the 5% significance level. A probability value of 0.000 indicates P < 0.001. The bottom row shows the total number of significant differences for each spectrum.
Subgroup  HK1  HK2  HK3  HK4  HL1  HL2  HF1  HF2  HF3  HF4  HF5  HF6  MK1  MK2  MK3  MK4  MK5  MsF  MsS  MsL
HK1  -  1.000  0.548  0.156  0.001  0.308  0.038  0.001  0.352  0.481  0.296  0.001  0.771  0.151  0.551  0.384  0.001  0.477  0.089  0.071
HK2  1.000  -  0.981  0.271  0.024  0.767  0.932  0.875  0.568  0.278  0.898  0.111  0.000  0.114  0.702  1.000  0.002  0.742  0.075  0.144
HK3  0.548  0.981  -  0.135  0.024  0.301  0.376  0.032  0.578  0.402  0.472  0.023  0.011  0.012  0.225  0.766  0.006  0.370  0.096  0.082
HK4  0.156  0.271  0.135  -  0.000  0.036  0.041  0.061  0.010  0.356  0.347  0.004  0.000  0.000  0.003  0.000  0.000  0.069  0.000  0.000
HL1  0.001  0.024  0.024  0.000  -  0.000  0.000  0.000  0.000  0.040  0.070  0.388  0.000  0.000  0.000  0.019  0.000  0.020  0.000  0.000
HL2  0.308  0.767  0.301  0.036  0.000  -  0.099  0.233  0.107  0.293  0.322  0.012  0.031  0.170  0.664  0.208  0.000  0.747  0.002  0.037
HF1  0.038  0.932  0.376  0.041  0.000  0.099  -  0.013  0.000  0.002  0.541  0.059  0.047  0.000  0.241  0.110  0.000  0.077  0.002  0.026
HF2  0.001  0.875  0.032  0.061  0.000  0.233  0.013  -  0.772  0.644  0.622  0.001  0.001  0.003  0.221  0.142  0.000  0.006  0.000  0.001
HF3  0.352  0.568  0.578  0.010  0.000  0.107  0.000  0.772  -  0.548  0.621  0.067  0.016  0.000  0.757  0.316  0.000  0.273  0.000  0.005
HF4  0.481  0.278  0.402  0.356  0.040  0.293  0.002  0.644  0.548  -  0.634  0.239  0.206  0.004  0.169  0.782  0.000  0.099  0.000  0.004
HF5  0.296  0.898  0.472  0.347  0.070  0.322  0.541  0.622  0.621  0.634  -  0.240  0.022  0.248  0.005  0.634  0.000  0.049  0.000  0.014
HF6  0.001  0.111  0.023  0.004  0.388  0.012  0.059  0.001  0.067  0.239  0.240  -  0.001  0.002  0.001  0.151  0.000  0.023  0.000  0.000
MK1  0.771  0.000  0.011  0.000  0.000  0.031  0.047  0.001  0.016  0.206  0.022  0.001  -  0.000  0.008  0.005  0.000  0.083  0.005  0.010
MK2  0.151  0.114  0.012  0.000  0.000  0.170  0.000  0.003  0.000  0.004  0.248  0.002  0.000  -  0.601  0.082  0.000  0.019  0.000  0.000
MK3  0.551  0.702  0.225  0.003  0.000  0.664  0.241  0.221  0.757  0.169  0.005  0.001  0.008  0.601  -  0.147  0.000  0.732  0.022  0.042
MK4  0.384  1.000  0.766  0.000  0.019  0.208  0.110  0.142  0.316  0.782  0.634  0.151  0.005  0.082  0.147  -  0.008  0.487  0.014  0.189
MK5  0.001  0.002  0.006  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.008  -  0.000  0.000  0.000
MsF  0.477  0.742  0.370  0.069  0.020  0.747  0.077  0.006  0.273  0.099  0.049  0.023  0.083  0.019  0.732  0.487  0.000  -  0.007  0.287
MsS  0.089  0.075  0.096  0.000  0.000  0.002  0.002  0.000  0.000  0.000  0.000  0.000  0.005  0.000  0.022  0.014  0.000  0.007  -  0.938
MsL  0.071  0.144  0.082  0.000  0.000  0.037  0.026  0.001  0.005  0.004  0.014  0.000  0.010  0.000  0.042  0.189  0.000  0.287  0.938  -
No. significant  5  3  6  12  17  7  11  11  8  6  6  12  16  13  8  5  19  7  15  13

Table IV. Significant hotspots (at the 5% significance level), after application of the χ2 heterogeneity test for each cell line subgroup (HK1–MsL) and pooled data (Total)
The first column shows the nucleotides where a significant hotspot was observed in one or more spectra. Hotspots significant at the a priori probability level are represented by gray and solid blocks. Hotspots significant at the a posteriori probability level are represented by solid blocks.

Table V. The probabilities of significant differences for each pair-wise comparison for cell line subgroups for the types of spontaneous substitutions after application of the Adams–Skopek test
Each value is the probability that the two spectra are different. Figures in bold highlight those spectra that were significantly different at the 5% significance level. A probability value of 0.000 indicates P < 0.001. The bottom row shows the total number of significant differences for each spectrum.
Subgroup  HK1  HK2  HK3  HK4  HL1  HL2  HF1  HF2  HF3  HF4  HF5  HF6  MK1  MK2  MK3  MK4  MK5  MsF  MsS  MsL
HK1  -  0.009  0.002  0.222  0.036  0.083  0.326  0.075  0.139  0.064  0.281  0.023  0.259  0.061  0.063  0.715  0.000  0.278  0.433  0.211
HK2  0.009  -  0.029  0.022  0.000  0.004  0.006  0.061  0.001  0.000  0.079  0.021  0.312  0.012  0.000  0.023  0.044  0.000  0.000  0.001
HK3  0.002  0.029  -  0.021  0.001  0.004  0.002  0.049  0.012  0.007  0.000  0.042  0.134  0.004  0.001  0.042  0.061  0.007  0.001  0.182
HK4  0.222  0.022  0.021  -  0.095  0.981  0.634  0.293  0.287  0.921  0.371  0.193  0.029  0.434  0.116  0.029  0.000  0.166  0.023  0.044
HL1  0.036  0.000  0.001  0.095  -  0.387  0.343  0.000  0.074  0.934  0.084  0.001  0.000  0.000  0.068  0.002  0.000  0.021  0.005  0.000
HL2  0.083  0.004  0.004  0.981  0.387  -  0.569  0.144  0.169  0.959  0.344  0.128  0.003  0.961  0.179  0.012  0.000  0.108  0.023  0.008
HF1  0.326  0.006  0.002  0.634  0.343  0.569  -  0.006  0.759  0.538  0.484  0.012  0.056  0.269  0.260  0.135  0.000  0.481  0.642  0.021
HF2  0.075  0.061  0.049  0.293  0.000  0.144  0.006  -  0.008  0.405  0.147  1.000  0.016  0.112  0.000  0.029  0.028  0.011  0.000  0.012
HF3  0.139  0.001  0.012  0.287  0.074  0.169  0.759  0.008  -  0.403  0.116  0.015  0.048  0.019  0.071  0.354  0.000  0.499  0.587  0.108
HF4  0.064  0.000  0.007  0.921  0.934  0.959  0.538  0.405  0.403  -  0.211  0.355  0.025  0.851  0.301  0.059  0.000  0.316  0.103  0.109
HF5  0.281  0.079  0.000  0.371  0.084  0.344  0.484  0.147  0.116  0.211  -  0.098  0.135  0.268  0.005  0.116  0.001  0.045  0.052  0.002
HF6  0.023  0.021  0.042  0.193  0.001  0.128  0.012  1.000  0.015  0.355  0.098  -  0.006  0.191  0.001  0.022  0.018  0.017  0.001  0.014
MK1  0.259  0.312  0.134  0.029  0.000  0.003  0.056  0.016  0.048  0.025  0.135  0.006  -  0.001  0.001  0.535  0.000  0.124  0.026  0.078
MK2  0.061  0.012  0.004  0.434  0.000  0.961  0.269  0.112  0.019  0.851  0.268  0.191  0.001  -  0.068  0.003  0.000  0.032  0.001  0.000
MK3  0.063  0.000  0.001  0.116  0.068  0.179  0.260  0.000  0.071  0.301  0.005  0.001  0.001  0.068  -  0.001  0.000  0.198  0.064  0.004
MK4  0.715  0.023  0.042  0.029  0.002  0.012  0.135  0.029  0.354  0.059  0.116  0.022  0.535  0.003  0.001  -  0.000  0.351  0.564  0.430
MK5  0.000  0.044  0.061  0.000  0.000  0.000  0.000  0.028  0.000  0.000  0.001  0.018  0.000  0.000  0.000  0.000  -  0.000  0.000  0.000
MsF  0.278  0.000  0.007  0.166  0.021  0.108  0.481  0.011  0.499  0.316  0.045  0.017  0.124  0.032  0.198  0.351  0.000  -  0.818  0.442
MsS  0.433  0.000  0.001  0.023  0.005  0.023  0.642  0.000  0.587  0.103  0.052  0.001  0.026  0.001  0.064  0.564  0.000  0.818  -  0.056
MsL  0.211  0.001  0.182  0.044  0.000  0.008  0.021  0.012  0.108  0.109  0.002  0.014  0.078  0.000  0.004  0.430  0.000  0.442  0.056  -
No. significant  5  16  16  7  12  7  6  11  7  4  5  13  11  10  9  10  18  8  10  11

Table VI. Rank orders of dinucleotides arranged, left to right, from most mutable to least mutable after application of nearest neighbour analysis (Cooper and Krawczak, 1990)
For example, for subgroup HF4 the dinucleotide CC was most mutable relative to all other dinucleotides, followed by TC. Dinucleotides separated by / had the same dinucleotide relative mutability.
Subgroup  Most mutable dinucleotide →  Least mutable dinucleotide(s)
HK1  GG>  GA>TT>AG>CG>TC>  AA/AC/AT/GC/GT/CA/CC/CT/TA/TG
HK2  GA>  AG>GG>AA/CG>TC>AT>CT>CC>  AC/GC/GT/CA/TA/TG/TT
HK3  GG>  TG>GC>CT>AA>CA>CG>TC>CC>AG>GA>  AC/AT/GT/TA/TT
HK4  GA>  AG>GC>CG>GG>TG>TC>CT>CC>AC/GT>TT>CA>  AA/AT/TA
HL1  TC>  CG>CT>CC>CA>GA>GG>AG>AA>GC>  AC/AT/GT/TA/TG/TT
HL2  GG>  TG>CG>GA>AG>TC>CC>CT>GC>AC>CA>  AA/AT/GT/TA/TT
HF1  GA>  AG/GG>CG>CT>TC>CC>AC/GC>CA>  AA/AT/GT/TA/TG/TT
HF2  CG>  TC>GA>CT>AG>GT>CC>CA>GG>AA>AT>AC/GC>  TA/TG/TT
HF3  TC>  CG>CC>CA>GA>GG>AC>CT>GC/GT>AG>  AA/AT/TA/TG/TT
HF4  CC>  TC>CG>GG>CT>GA>GC>AG>  AA/AC/AT/GT/CA/TA/TG/TT
HF5  CT>  TC>GG>CG/GA>AG>TT>CC>CA>  AA/AC/AT/GC/GT/TA/TG
HF6  TC>  CG>CT>CC>GA>AC/GG/TT>CA>  AA/AG/AT/GC/GT/TA/TG
MK1  TA>  GT>CG>AA>TT>GG>TC>AT>CC>AC>CT>GA>  AG/GC/CA/TG
MK2  CG>  GG>GA>CT>TC>AG>CC>TG>CA>GT>AA>GC/TT>AC>  AT/TA
MK3  CG>  GA>GG/TG>CC>GT/TC>AG/CT>CA>GC>TT>  AA/AC/AT/TA
MK4  TT>  CG>GG>TC>CT>GT>GA>  AA/AG/AC/AT/GC/CA/CC/TA/TG
MK5  GT>  GG>CT>TC>CC>GA/CG>GC>AG>  AA/AC/AT/CA/TA/TG/TT
MsF  TG>  CG>GG>AG>TC>GA>GC>CC>AT>GT/TT>CT>AA>  AC/CA/TA
MsS  CG>  GG>GC>TG>GT>GA>AG/TA>TT>CA>AA>AC>TC>CT>  AT/CC
MsL  CG>  TG>GA>GC>GG>GT>TT>TA/AG>TC>AA>CA>CC>CT>  AT/AC
Total  CG>  GG>GA>TC>TG>CT>AG>CC>GT>GC>CA>TT>TA>AA>AC>  AT

Table VII. Strand bias for mutations at cytosines (C) and guanines (G) in the NT strand for each cell line/subgroup
P values significant at the 5% significance level are shown in bold. A probability value of 0.000 indicates P < 0.001.
^a A significant P value where there was a higher level of substitutions observed at cytosines in the NT strand.
Subgroup  Nucleotide  Observed  Expected  χ2  P
Subgroup  C observed  C expected  G observed  G expected  χ2  P
HK1  0  2  4  2  4.909  0.027
HK2  2  4  5  3  1.992  0.158
HK3  6  6  4  4  0.097  0.755
HK4  13  25  32  20  12.498  0.000
HL1  91  59  14  46  42.286  0.000a
HL2  17  24  26  19  4.212  0.040
HF1  17  16  12  13  0.145  0.703
HF2  12  19  23  16  6.130  0.013
HF3  17  15  10  12  0.674  0.412
HF4  15  12  7  10  1.521  0.217
HF5  11  10  7  8  0.263  0.608
HF6  27  17  4  14  12.827  0.000a
MK1  4  4  3  3  0.012  0.914
MK2  75  116  135  94  31.907  0.000
MK3  20  30  35  25  7.806  0.005
MK4  2  3  3  2  0.461  0.497
MK5  10  17  20  13  5.746  0.017
MsF  9  15  19  13  5.966  0.015
MsS  10  20  26  16  10.864  0.001
MsL  12  20  24  16  6.896  0.009

P values below 0.05 are significant at the 5% significance level. A probability value of 0.000 indicates P < 0.001.
aA significant P value where there was a higher level of substitutions observed at cytosines in the NT strand.
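As a worked illustration of the strand-bias statistic, the following Python sketch applies the χ2 sum of (observed − expected)² / expected to the two categories (C and G) of one row and converts it to a P value for 1 degree of freedom. The function names are hypothetical. Because the expected counts in the table are rounded to whole numbers, the value computed from them will not reproduce the tabulated χ2 exactly; the sketch only shows the arithmetic.

```python
import math


def chi_square(observed, expected):
    """Pearson chi-square: sum of (obs - exp)^2 / exp over categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# HL1 row of Table VII: observed and (rounded) expected counts at C and G.
hl1_chi2 = chi_square([91, 14], [59, 46])
hl1_p = p_value_1df(hl1_chi2)

# HF1 row, a non-significant example.
hf1_chi2 = chi_square([17, 12], [16, 13])
hf1_p = p_value_1df(hf1_chi2)

print(f"HL1: chi2 = {hl1_chi2:.3f}, P = {hl1_p:.3g}")
print(f"HF1: chi2 = {hf1_chi2:.3f}, P = {hf1_p:.3f}")
```

The identity used for the P value, erfc(sqrt(χ2/2)), holds only for 1 degree of freedom.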
Fig. 1. supF mutation spectra where the tRNA nucleotide sequence for the NT strand is represented on the x-axis (nt 99–183) and frequency of data (actual numbers) is shown on the y-axis. Each base substitution is colour coded: a nucleotide change to A is green, to G is red and to Y is blue. X can be A, C, G or T depending on the nucleotide position. (A) Pooled supF spontaneous mutation spectra. Each nucleotide position shows the number of base substitutions observed over all mutant data. The types of substitution are colour coded, for example, at C133 52 substitutions were observed: nine were C→A (green), 38 were C→G (red) and seven were C→T (blue). (B) Mutation spectrum for spontaneous substitutions after single-stranded supF was transfected into monkey kidney COS-7 cells. (C) Spectrum showing the number of individual mutation spectra contributing mutant data to each nucleotide site, for example, mutations at G123 were derived from 16 references, whereas no substitutions were recorded from any reference at A147.
Fig. 2. Individual spontaneous mutation spectra were derived for all sources as described in Table I. The axes and colour coding of substitutions in the spectra are described in Figure 1.

1 To whom correspondence should be addressed. Tel: +44 1792 205200; Fax: +44 1792 205200; Email: balewis@swan.ac.uk

The studies described here were made possible by support from the Biological and Biotechnology Research Council, Otsuka Pharmaceuticals and BAT Ltd.

References

Adams,W.T. and Skopek,T.R. (1987) Statistical test for the comparison of samples from mutational spectra. J. Mol. Biol., 194, 391–396.
Akasaka,S., Takimoto,K. and Yamamoto,K. (1992) G:C→T:A and G:C→C:G transversions are the predominant spontaneous mutations in the Escherichia coli supF gene: an improved LacZ(Am) Escherichia coli host designed for assaying pZ189 supF mutational specificity. Mol. Gen. Genet., 235, 173–178.
Akman,S.A., Forrest,G.P., Doroshow,J.H. and Dizdaroglu,M. (1991) Mutation of potassium permanganate-treated and hydrogen peroxide-treated plasmid pZ189 replicating in CV-1 monkey kidney cells. Mutat. Res., 261, 123–130.
Ames,B.N. and Gold,L.S. (1991) Endogenous mutagens and the causes of aging and cancer. Mutat. Res., 250, 3–16.
Bigger,C.A.H., Flickinger,D.J., Strandberg,J., Pataki,J., Harvey,R.G. and Dipple,A. (1990) Mutational specificity of the anti-1,2-dihydrodiol 3,4-epoxide of 5-methylchrysene. Carcinogenesis, 11, 2263–2265.
Bigger,C.A.H., Stjohn,J., Yagi,H., Jerina,D.M. and Dipple,A. (1992) Mutagenic specificities of 4 stereoisomeric benzo[c]phenanthrene dihydrodiol epoxides. Proc. Natl Acad. Sci. USA, 89, 368–372.
Bigger,C.A.H., Ponten,I., Page,J.E. and Dipple,A. (2000) Mutational spectra for polycyclic aromatic hydrocarbons in the supF target gene. Mutat. Res., 450, 75–93.
Boldt,J., Mah,M.C.M., Wang,Y.C., Smith,B.A., Beland,F.A., Maher,V.M. and McCormick,J.J. (1991) Kinds of mutations found when a shuttle vector containing adducts of 1,6-dinitropyrene replicates in human cells. Carcinogenesis, 12, 119–126.
Burcham,P.C. (1999) Internal hazards: baseline DNA damage by endogenous products of normal metabolism. Mutat. Res., 443, 11–36.
Cabral Neto,J.B., Gentil,A., Cabral,R.E.C. and Sarasin,A. (1992) Mutation spectrum of heat-induced abasic sites on a single-stranded shuttle vector replicated in mammalian cells. J. Biol. Chem., 267, 19718–19723.
Cabral Neto,J.B., Gentil,A., Cabral,R.E.C. and Sarasin,A. (1993) Implication of uracil in spontaneous mutagenesis on a single-stranded shuttle vector replicated in mammalian cells. Mutat. Res., 288, 249–255.
Canella,K.A. and Seidman,M.M. (2000) Mutation spectra in supF: approaches to elucidating sequence context effects. Mutat. Res., 450, 61–73.
Cariello,N.F., Piegorsch,W.W., Adams,W.T. and Skopek,T.R. (1994) Computer program for the analysis of mutational spectra: application to p53 mutations. Carcinogenesis, 15, 2281–2285.
Cooper,D.N. and Krawczak,M. (1990) The mutational spectrum of single base-pair substitutions causing human genetic disease: patterns and predictions. Hum. Genet., 85, 55–74.
Dammann,R. and Pfeifer,G.P. (1997) Lack of gene- and strand-specific DNA repair in RNA polymerase III-transcribed human tRNA genes. Mol. Cell. Biol., 17, 219–229.
de Boer,J.G., Erfle,H., Holcroft,J., Walsh,D., Dycaico,M., Provost,S., Short,J. and Glickman,B.W. (1996) Spontaneous mutants recovered from liver and germ cell tissue of low copy number lacI transgenic rats. Mutat. Res., 352, 73–78.
de Boer,J.G., Provost,S., Gorelick,N., Tindall,K. and Glickman,B.W. (1998) Spontaneous mutation in lacI transgenic mice: a comparison of tissues. Mutagenesis, 13, 109–114.
Douglas,G.R., Gingerich,J.D., Gossen,J.A. and Bartlett,S.A. (1994) Sequence spectra of spontaneous LacZ gene-mutations in transgenic mouse somatic and germline tissues. Mutagenesis, 9, 451–458.
Eckert,K.A. and Opresko,P.L. (1999) DNA polymerase mutagenic bypass and proofreading of endogenous DNA lesions. Mutat. Res., 424, 221–236.
Friedberg,E.C., Walker,G.C. and Siede,W. (1995) DNA Repair and Mutagenesis. ASM Press, Washington, D.C.
Fronza,G., Madzak,C., Campomenosi,P., Inga,A., Iannone,R., Abbondandolo,A. and Sarasin,A. (1994) Mutation spectrum of 4-nitroquinoline 1-oxide-damaged single-stranded shuttle vector DNA transfected into monkey cells. Mutat. Res., 308, 117–125.
Glickman,B.W., Saddi,V.A. and Curry,J. (1994) International Commission for Protection Against Environmental Mutagens and Carcinogens. Working paper no. 2. Spontaneous mutations in mammalian cells. Mutat. Res., 304, 19–32.
Harbach,P.R., Zimmer,D.M., Filipunas,A.L., Mattes,W.B. and Aaron,C.S. (1999) Spontaneous mutation spectrum at the lambda cII locus in liver, lung and spleen tissue of Big Blue® transgenic mice. Environ. Mol. Mutagen., 33, 132–143.
Hauser,J., Seidman,M.M., Sidur,K. and Dixon,K. (1986) Sequence specificity of point mutations induced during passage of a UV-irradiated shuttle vector plasmid in monkey cells. Mol. Cell. Biol., 6, 277–285.
Hauser,J., Levine,A.S. and Dixon,K. (1987) Unique pattern of point mutations arising after gene-transfer into mammalian cells. EMBO J., 6, 63–67.
Jaberaboansari,A., Dunn,W.C., Preston,R.J., Mitra,S. and Waters,L.C. (1991) Mutations induced by ionizing-radiation in a plasmid replicated in human cells. 2. Sequence-analysis of alpha-particle-induced point mutations. Radiat. Res., 127, 202–210.
Juedes,M.J. and Wogan,G.N. (1996) Peroxynitrite-induced mutation spectra of pSP189 following replication in bacteria and in human cells. Mutat. Res., 349, 51–61.
Kawanishi,M., Matsuda,T., Nakayama,A., Takebe,H., Matsui,S. and Yagi,T. (1998) Molecular analysis of mutations induced by acrolein in human fibroblast cells using supF shuttle vector plasmids. Mutat. Res., 417, 65–73.
Kraemer,K.H. and Seidman,M.M. (1989) Use of supF, an Escherichia coli tyrosine suppressor transfer-RNA gene as a mutagenic target in shuttle-vector plasmids. Mutat. Res., 220, 61–72.
Krawczak,M. and Cooper,D.N. (1997) The human gene mutation database. Trends Genet., 13, 121–122.
Khromov-Borisov,N.N., Rogozin,I.B., Henriques,J.A.P. and de Serres,F.J. (1999) Similarity pattern analysis in mutational distributions. Mutat. Res., 430, 55–74.
Leach,E.G., Narayanan,L., Havre,P.A., Gunther,E.J., Yeasky,T.M. and Glazer,P.M. (1996) Tissue specificity of spontaneous point mutations in lambda supF transgenic mice. Environ. Mol. Mutagen., 28, 459–464.
Lewis,P.D., Harvey,J.S., Waters,E.M. and Parry,J.M. (2000) The mammalian gene mutation database. Mutagenesis, 15, 411–414.
Loeb,L.A. (1989) Endogenous carcinogenesis: molecular oncology into the twenty-first century: Presidential Address. Cancer Res., 49, 5489–5496.
Madzak,C., Cabralneto,J.B., Menck,C.F.M. and Sarasin,A. (1992) Spontaneous and ultraviolet-induced mutations on a single-stranded shuttle vector transfected into monkey cells. Mutat. Res., 274, 135–145.
Moraes,E.C., Keyse,S.M., Pidoux,M. and Tyrrell,R.M. (1989) The spectrum of mutations generated by passage of a hydrogen-peroxide damaged shuttle vector plasmid through a mammalian host. Nucleic Acids Res., 17, 8301–8312.
Murata-Kamiya,N., Kamiya,H., Kaji,H. and Kasai,H. (1997) Glyoxal, a major product of DNA oxidation, induces mutations at G:C sites on a shuttle vector plasmid replicated in mammalian cells. Nucleic Acids Res., 25, 1897–1902.
Myrand,S.P., Topping,R.S. and States,J.C. (1996) Stable transformation of xeroderma pigmentosum group A cells with an XPA minigene restores normal DNA repair and mutagenesis of UV-treated plasmids. Carcinogenesis, 17, 1909–1917.
Ono,T., Ikehata,H., Nakamura,S., Saito,Y., Komura,J., Hosoi,Y. and Yamamoto,K. (1999) Molecular nature of mutations induced by a high dose of X-rays in spleen, liver and brain of the lacZ-transgenic mouse. Environ. Mol. Mutagen., 34, 97–105.
Sarkar,S., Dasgupta,U.B. and Summers,W.C. (1984) Error-prone mutagenesis detected in mammalian cells by a shuttle vector containing the supF gene of Escherichia coli. Mol. Cell. Biol., 4, 2227–2230.
Seidman,M.M., Dixon,K., Razzaque,A., Zagursky,R.J. and Berman,M.L. (1985) A shuttle vector plasmid for studying carcinogen-induced point mutations in mammalian cells. Gene, 38, 233–237.
Seidman,M.M., Bredberg,A., Seetharam,S. and Kraemer,K.H. (1987) Multiple point mutations in a shuttle vector propagated in human cells: evidence for an error-prone DNA-polymerase activity. Proc. Natl Acad. Sci. USA, 84, 4944–4948.
Sikpi,M.O., Freedman,M.L., Ziobron,E.R., Upholt,W.B. and Lurie,A.G. (1991) Dependence of the mutation spectrum in a shuttle plasmid replicated in human lymphoblasts on dose of gamma-radiation. Int. J. Radiat. Biol., 59, 1115–1126.
Smith,K.C. (1992) Spontaneous mutagenesis: experimental, genetic and other factors. Mutat. Res., 277, 139–162.
Tarone,R.E. (1989) Testing for non-randomness of events in sparse data situations. Ann. Hum. Genet., 53, 381–387.
Vineis,P., Malats,N., Porta,M. and Real,F.X. (1999) Human cancer, carcinogenic exposures and mutation spectra. Mutat. Res., 436, 185–194.
Yang,J.L., Maher,V.M. and McCormick,J.J. (1987) Kinds of mutations formed when a shuttle vector containing adducts of (+/–)-7-beta,8-alpha-dihydroxy-9-alpha,10-alpha-epoxy-7,8,9,10-tetrahydrobenzo[a]pyrene replicates in human cells. Proc. Natl Acad. Sci. USA, 84, 3787–3791.

© UK Environmental Mutagen Society/Oxford University Press 2001
Mutagenesis, Oxford University Press

Spontaneous mutation spectra in supF: comparative analysis of mammalian cell line base substitution spectra


References (48)

Publisher
Oxford University Press
Copyright
© UK Environmental Mutagen Society/Oxford University Press 2001
ISSN
0267-8357
eISSN
1464-3804
DOI
10.1093/mutage/16.6.503

Abstract

The last decade has seen a dramatic accumulation of mutation data from reporter genes utilized in mutagenesis experiments involving DNA reactive agents, allowing comparisons of mutagenic potential between many different mutagens. When analysing chemically induced mutation spectra it is important to establish the potential spontaneous background before drawing conclusions concerning specific chemically induced hotspots. A major mutation reporter gene used in mammalian cells is the supF suppressor tRNA gene. The Mammalian Gene Mutation Database (MGMD) contains a considerable number of supF spontaneous mutations, permitting a thorough analysis of spontaneous mutations in mammalian cell lines from different species and tissues. Analyses of spontaneous mutation spectra were performed using a range of statistical techniques. Spontaneous mutations were observed at 82.4% of the nucleotides in the supF suppressor tRNA sequence, although the pattern of significant hotspots differed between cell lines. Our analyses show considerable variation both within and between cell lines in the distributions of spontaneous mutations, with no clear tissue- or species-specific patterns emerging. In addition, spectra derived from supF recovered from liver and skin of transgenic mice were similar to each other, but showed significant differences from many in vitro spectra. The most common base substitutions were G:C→T:A transversions and G:C→A:T transitions, although levels of each type differed between cell lines. There was also variation between cell lines in the most mutable dinucleotides; however, significant hotspots were frequently observed at CpG sites and sequences containing GG/CC. We conclude that the number of varying distributions and potential hotspots for spontaneous mutations should be considered when comparing chemically induced mutation spectra in supF. 
The spectra presented here will be a useful reference for analysis and re-analysis of chemically induced spectra as well as for use in comparison with the spontaneous spectra of other gene systems.

Introduction

In order to gain insight into the underlying molecular events that lead to mutagenesis, both in vitro and in vivo model systems have been developed to detect mutation events. Much mutant data is available for many environmental mutagens, allowing the generation of mutational spectra highlighting the mutation hotspots and coldspots in a target gene. Analysis of the specificity of a particular mutagen allows assumptions to be made as to the types of genetic lesions that a compound induces. Such assumptions can then be extrapolated to ‘cancer genes’ and assessments made for carcinogenicity hazard and risk, although this area remains controversial (Vineis et al., 1999).

Although much attention has been given to the potential of exogenous agents to produce mutations, there has been little focus on the patterns of spontaneous events. The term ‘spontaneous’ has been used rather loosely, but generally describes mutations that have arisen in the absence of any specified treatment (Glickman et al., 1994). The fact that all types of mutation can arise spontaneously and have presumed roles in carcinogenesis (Loeb, 1989) and ageing (Ames and Gold, 1991) highlights the importance of spontaneous mutagenesis. Furthermore, when analysing the mutagenic effects of exogenous compounds it makes sense to have as much understanding as possible of the spontaneous ‘background’ that may contribute to the overall mutation spectrum. It is thought that a combination of factors gives rise to spontaneous mutations, including errors in DNA replication, recombination and repair (Friedberg et al., 1995). 
DNA damage can also result from the intrinsic instability of DNA, such as deamination, as well as endogenous products of metabolism, including oxidants, alkylating agents, reactive nitrogen species, lipid peroxidation products and other intermediates of various metabolic pathways (Smith, 1992; Burcham, 1999). It has also been established that variations in experimental procedures used in model systems can affect the yields and types of spontaneous mutations observed (Smith, 1992).

The Escherichia coli gene supF encodes a tyrosine amber suppressor tRNA molecule which has been adapted for the study of mutagenesis in several shuttle-vector plasmids (reviewed by Kraemer and Seidman, 1989). supF was first described as a shuttle-vector mutagenesis marker in the plasmids p3AC (Sarkar et al., 1984) and pZ189 (Seidman et al., 1985). These plasmids and their derivatives have been used extensively for assessing mutagenesis in mammalian cells. The transient shuttle-vectors can be treated with a DNA-damaging agent before or after transfection into mammalian cells. Active supF allows for the incorporation of tyrosine at amber chain termination codons (UAG). After isolation from mammalian cells, the most common assay for identifying supF mutants involves introduction of the shuttle-vector into a bacterial indicator strain carrying an amber mutation in the lacZ gene, which codes for β-galactosidase (Kraemer and Seidman, 1989). In this screening system functional supF allows read-through of the stop codon, resulting in blue colonies, whereas colonies with inactivated or partially active supF are white or light blue (Kraemer and Seidman, 1989).

One of the main advantages of supF in shuttle-vector plasmids is the gene's extreme sensitivity to mutagenic inactivation. The suppressor tRNA region is only 85 nt in length and a review by Canella and Seidman (2000) showed that base substitutions have been observed in all of these nucleotides. 
In addition, all six possible base substitution mutations can be monitored at all but 10 sites, allowing the detection of all three base substitutions that can occur at each base pair. Therefore, the selection bias associated with protein coding mutagenesis marker genes is largely avoided.

The Mammalian Gene Mutation Database (MGMD) contains ~16 000 entries for supF DNA sequence information for mutants that have arisen in mammalian cell lines, of which nearly 900 are spontaneous base substitution data. Mutation data in MGMD can be rapidly accessed (http://www.listnweb.swan.ac.uk/cmgt/index.htm) and retrieved for comparison and analysis (Lewis et al., 2000). supF spontaneous mutation data is available for DNA repair-deficient human cell lines and DNA repair-proficient human, monkey and mouse cell lines, as well as for transgenic mice. The accumulation of such data permits the generation of spontaneous mutation spectra and profiles for supF in individual mammalian cell types.

The identification of differences in the distributions and frequencies of spontaneous mutations in supF transfected into different cell types is a prerequisite when analysing and comparing the mutation spectra of exogenous compounds obtained using the same system. Potential spontaneous mutation differences between cell types could indicate underlying variations in methylation, cell metabolism and DNA repair capacity. Most importantly, when sufficient data is available the generation of spontaneous mutation spectra yields an estimate of the ‘background’ that should be considered for each cell type before drawing conclusions from induced mutation spectra produced by mutagens and carcinogens.

This report presents a comprehensive survey and analysis of spontaneous mutations recorded for the supF gene in mammalian cell lines, providing a reference and baseline for the study and comparison of induced mutation spectra of exogenous compounds in this gene. 
Materials and methods

The mammalian gene mutation database

The data presented in this paper was retrieved from the MGMD (Lewis et al., 2000). The database contains DNA sequence information on over 39 000 mutants from gene mutation assays involving the tyrosine suppressor tRNA (supF) gene, the hypoxanthine guanine phosphoribosyltransferase (hprt) gene, the adenine phosphoribosyltransferase (aprt) gene, the xanthine (guanine) phosphoribosyltransferase (gpt) gene and the dihydrofolate reductase (dhfr) gene. Mutants are classified as single base pair substitutions, tandem base pair substitutions (two adjacent base pair substitutions), multiple base pair substitutions (two non-adjacent base pair substitutions), deletions, insertions, duplications, rearrangements and complex (e.g. a base pair substitution and deletion in the same mutant) events. Currently, there are ~16 000 mutant entries for supF, of which over 1700 are spontaneous data. All the mutants analysed were assumed to be of independent origin. The supF gene consists of a 35 bp promoter, a 40 bp pre-tRNA region (having a function in RNA processing), an 85 bp tRNA sequence and a short 17 bp 3′-flanking sequence (Kraemer and Seidman, 1989).

Cell types

Spontaneous supF mutation data in MGMD were available for human (Ad293, GM606, GM0637, W138-VA13, XP12BE-SV, XAN1 and 2-0-A2), monkey (CV-1 and COS-7) and mouse (LN12) transformed cell lines (Table I). The data analysed also included supF base substitutions that had arisen in transgenic mouse strains.

Mutation spectra

The majority of the supF mutants in the literature are recorded for the non-transcribed (NT) strand. Due to the structure of the supF gene and the alterations made to supF shuttle-vectors in the pre-tRNA region, only mutations that arose in the suppressor tRNA region (nt 99–183) have been analysed in this study. The numbering system for the supF NT strand used here is that of Seidman et al. (1985). 
Mutation data can be copied directly from MGMD to a spreadsheet for simple generation of mutation spectra.

Comparison of mutation spectra

The hypergeometric test was used to test for significant differences in the distribution of mutations between spectra following the method described by Adams and Skopek (1987). The algorithm is available in a free user-friendly computer program (Cariello et al., 1994). Using this test, the probability of the observed results for a table of M spectra and N mutable sites is calculated. Monte Carlo simulation is then used to generate 10 000 random tables, fixing the marginal totals for each to those observed in the experimental data. The proportion of random tables giving probabilities for the simulated results that are equal to or less than that for the observed results is taken as the probability of the observed results under the null hypothesis of no association in the observed table.

The identification of hotspots within individual spectra was based on the approach described by Tarone (1989). The expected number of mutations at a site is calculated as the total number of mutations in the spectrum divided by the number of sites, i.e. each mutation is taken to be equally likely to fall at any site. The binomial distribution is used to calculate the a priori probability of the observed or a greater number of mutations at a given site. Monte Carlo simulation is then used to generate 10 000 random spectra using the same expected probabilities. The proportion of these spectra having at least one site significant at this a priori level is then the a posteriori significance level for that site. By construction, the a priori probability value is numerically no higher than the a posteriori level for a site, so sites that are significant a priori are not always significant a posteriori. The a posteriori value takes account of the fact that even with random data lacking true hotspots some sites out of the many in a spectrum will generate significant a priori probability values just by chance. 
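The hotspot procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' FORTRAN program: every site is given the same mutation probability (1 / number of sites), the function names are hypothetical, and the toy numbers (30 mutations over 85 sites, four of them at one site) are invented for the example. With equal site probabilities, a random spectrum contains a site significant at the observed a priori level exactly when its largest site count is at least the observed count, which is the shortcut the simulation uses.

```python
import math
import random


def a_priori_probability(k, n_mut, n_sites):
    """Binomial probability of k or more of n_mut mutations at one site."""
    p = 1.0 / n_sites  # assumption: all sites equally likely to mutate
    return sum(
        math.comb(n_mut, i) * p ** i * (1.0 - p) ** (n_mut - i)
        for i in range(k, n_mut + 1)
    )


def a_posteriori_probability(k, n_mut, n_sites, n_sim=10_000, seed=0):
    """Proportion of random spectra with at least one site holding >= k mutations."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        counts = [0] * n_sites
        for _ in range(n_mut):
            counts[rng.randrange(n_sites)] += 1
        if max(counts) >= k:
            hits += 1
    return hits / n_sim


# Hypothetical toy spectrum: 30 mutations over 85 sites, 4 at the site of interest.
prior = a_priori_probability(4, 30, 85)
posterior = a_posteriori_probability(4, 30, 85)
print(f"a priori = {prior:.2e}, a posteriori = {posterior:.3f}")
```

The simulated a posteriori value comes out larger than the a priori value, which is the multiple-testing correction the text describes.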
Thus use of the a posteriori values reduces the likelihood of a Type I (false positive) error. FORTRAN programs for carrying out these tests are available on request from the authors.

Comparison of mutation types

The hypergeometric test was further used to compare base substitution frequencies between spectra.

Dinucleotide relative mutability

The likelihood of mutation within specific dinucleotides for each cell type was calculated as described in detail by Cooper and Krawczak (1990). Sixteen dinucleotides are possible in a DNA sequence and the frequency of each can be calculated for a given nucleotide length. For each dinucleotide the expected number of substitutions is equal to: dinucleotide frequency × total number of substitutions in all dinucleotides. One can then compare the observed versus the expected mutations for each dinucleotide. Furthermore, the mutability of each dinucleotide can be calculated relative to the least mutable dinucleotide (i.e. that presenting the least number of substitutions) by the equation: drm(d) = [O(d)˙E(least mutable d)˙E(d)] where drm(d) is the dinucleotide relative mutability, O(d) is the observed dinucleotide frequency and E(d) is the expected dinucleotide frequency. Dinucleotide relative mutabilities can then be ordered according to rank from the most mutable to the least mutable (Cooper and Krawczak, 1990), for example CG > TC > GG > ... AA > TA.

Strand bias for base substitutions

The statistical significance of differences between observed and expected base substitutions in the transcribed and NT strand was determined by χ2 analysis where: \[\chi^{2} = \sum \frac{(obs - exp)^{2}}{exp}\] (with 1 degree of freedom).

Results

Comparison of mutation spectra

A total of 837 supF tRNA sequence spontaneous base substitutions were found in MGMD that were derived from 20 individual publications (Table I). 
Initially, the mutation data from all publications were pooled to obtain a total mutation spectrum and to provide an overall insight into the spontaneously mutable regions of supF after transfection into mammalian cells (Figure 1A). Spontaneous mutations can be seen to be non-randomly distributed along the sequence, the majority of substitutions occurring at cytosines or guanines. The most mutable regions involved guanine tracts (G102–G105 and G122–G124), cytosine tracts (C108–C110) and CpG sites, particularly when found within a TCGA sequence (148–151, 154–157 and 162–165), where the level of mutations at the cytosine was increased relative to other CpG sites. Twenty-three nucleotides were significant mutation hotspots (at the 5% significance level): G102, G103, G104, C108, C109, G111, G115, G122, G123, G129, C133, C139, C146, C149, G150, C155, G156, G159, G160, C163, G164, C168 and C169. Sites showing no detectable mutations were A112, A119, A121, A128, T132, T145, A147, T148, A158, T170, T171, C178, C181 and A183. Table II shows that the most predominant type of spontaneous substitution was G:C→A:T (47.9%), followed by G:C→T:A (31.2%) and G:C→C:G (14.9%), although the overall number of transitions and transversions was similar (49.2 and 50.8%, respectively). The most frequent mutation hotspot, C133, falls within the supF anticodon sequence where C→G and C→A transversions change the sequence to code for tyrosine. Surprisingly, C→T transitions were also observed, even though this mutation still results in a stop codon, and all were single substitutions. The second most frequent hotspot, G123, was also the major hotspot in a spectrum (Figure 1B) derived for substitutions observed after single-stranded (transcribed strand) supF was transfected into the monkey kidney COS7 cell line (Cabral Neto et al., 1992, 1993; Madzak et al., 1992; Fronza et al., 1994). Virtually all substitutions observed in single-stranded supF were at cytosines. 
Three other major spontaneous hotspots, G156, G159 and G164, were also observed to be highly mutable in the single-stranded spectrum. Other mutable regions in the overall spontaneous spectrum were not detected in the single-stranded spectrum. In order to establish the heterogeneity of data derived from different sources and cell lines, the contribution of each publication to the total number of substitutions at each site was investigated (Figure 1C). There is variation between mutated nucleotides as to the number of individual spectra contributing to the level of mutations. For example, 16 individual spectra showed a substitution at G123 (one of the significant hotspots), whereas only one individual spectrum had a substitution at T107. In fact, there was strong positive correlation between the number of mutations per site and the number of individual spectra showing a mutation (r = 0.943, P < 0.0001). This resulted in a very similar (P = 0.991) distribution pattern for the spectra shown in Figure 1A and C. This could be expected given the different probability of mutation at each site and the variation in numbers of mutations in individual spectra (Table I). However, it was unknown whether the pooled mutation distribution was representative of all individual spectra or was comprised of distributions specific to cell lines, tissues or species. If the pooled spectrum was homogeneous then it should represent the true spontaneous supF mutational spectrum. Therefore, each individual spectrum was constructed for comparison (Figure 2). First, the spectra were compared by eye for similarities and differences. It was immediately apparent that some mutable sites were peculiar to a number of spectra, but not necessarily all. These sites tended to fall in the mutable regions described above for the overall spontaneous spectrum. 
However, many spectral comparisons appeared to reveal different mutation patterns and, site-for-site, no two spectra shared exactly the same distribution. Application of the Adams–Skopek test, for comparing the mutation distributions in two mutation spectra, confirmed that many pair-wise spectra comparisons (Table III) were significantly different (100/190). Table III also reveals that no two spectra differ from the other spectra in the same pattern. For example, although HF3 and HF4 have a statistically similar distribution (P = 0.548), HF3 and MK1 were significantly different, whereas HF4 and MK1 were not. The rank order of spectra, from the greatest number of significant differences to the least (when compared with all other spectra), was: MK5 (19) > HL1 (17) > MK1 (16) > MsS (15) > MK2 = MsL (13) > HK4 = HF6 (12) > HF1 = HF2 (11) > HF3 = MK3 (8) > HL2 = MsF (7) > HK3 = HF4 = HF5 (6) > HK1 = MK4 (5) > HK2 (3), where the number in parentheses is the number of times that cell line spectrum was significantly different in a pair-wise comparison (bottom row of Table III). Spectra with a low number of mutations generally showed few significant differences with other spectra; however, there was no correlation between the number of mutations per spectrum and the number of significant pair-wise comparisons per spectrum (r = 0.158, P = 0.082). MK5 (COS-7) was significantly different to all other spectra, whereas the other COS-7 spectrum (MK4) was significantly different to only five others. Interestingly, there was no true underlying pattern that could be established for spectra either within or between different cell lines. Comparisons between human kidney cell line Ad293 (HK1–HK4) were all non-significant, but three of these spectra contained 13 or fewer mutations. Indeed, the probability value for the comparison between HK1 and HK2 is equal to 1.000, yet neither spectrum shares a common mutated site. This problem partially arises because all the mutated sites in both spectra have just one mutation and this is unavoidable when comparing spectra containing only a few mutations.
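The pair-wise comparison described above can be illustrated with a minimal sketch of a Monte Carlo hypergeometric test on a 2 × N table of per-site mutation counts, which is the general idea behind the Adams–Skopek approach. The function names, the number of iterations and the fixed seed are assumptions made for illustration, and the published procedure may differ in detail.

```python
import math
import random


def log_table_prob(a, b):
    """Log hypergeometric probability of a 2 x N table, up to a constant
    that depends only on the fixed row and column totals."""
    return -sum(math.lgamma(x + 1) + math.lgamma(y + 1) for x, y in zip(a, b))


def monte_carlo_spectrum_test(a, b, n_iter=2000, seed=0):
    """Estimate P for the hypothesis that spectra a and b (lists of
    mutation counts per site) are drawn from the same distribution."""
    rng = random.Random(seed)
    # One entry per observed mutation, labelled with the site it hit.
    sites = [i for i, (x, y) in enumerate(zip(a, b)) for _ in range(x + y)]
    n_a = sum(a)
    observed = log_table_prob(a, b)
    hits = 0
    for _ in range(n_iter):
        # Randomly reassign mutations to the two spectra, keeping totals fixed.
        rng.shuffle(sites)
        sim_a = [0] * len(a)
        for s in sites[:n_a]:
            sim_a[s] += 1
        sim_b = [a[i] + b[i] - sim_a[i] for i in range(len(a))]
        # Count simulated tables that are at least as improbable as the data.
        if log_table_prob(sim_a, sim_b) <= observed + 1e-9:
            hits += 1
    return hits / n_iter
```

When every mutated site carries exactly one mutation, all reshuffled tables have the same probability as the observed one, so this estimate is 1.0 whether or not the two spectra share any site. This is the behaviour noted above for HK1 and HK2.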
The xeroderma pigmentosum (XP) derived fibroblast cell line spectra (HF3–HF5) also showed non-significant results when compared with each other. All other 'within-cell line' comparisons showed at least one pair of spectra to be significantly different. The spectra derived from supF rescued from the liver and skin of transgenic mice were also very different from the remaining in vitro derived spectra, although both were highly similar to each other (P = 0.938). Importantly, there seemed to be no clear relationship between mutation distribution and the plasmid carrying the supF gene or the bacterial host used for mutant selection. In an effort to establish the basis for the lack of an underlying cell line pattern and why some spectra were so different to the rest (such as MK5 and HL1), the significant hotspots (P < 0.05) were determined for each individual spectrum (Table IV). This would allow potential randomly mutated sites to be distinguished from sites that could be considered true spontaneous hotspots. At the a priori probability level 27 hotspots (71.1%) were shared by at least two spectra and 11 hotspots (28.9%) were unique to individual spectra, although these were not necessarily rare mutation sites in other spectra. In addition, 12 hotspots that were significant in individual spectra were not significant after the data were pooled. Hotspot sites in all spectra were also mutable sites in at least one other spectrum. The MK5 spectrum contained one unique hotspot (G160) that accounted for 48.4% of the MK5 substitutions. HL1 was similar to only two other spectra, HF5 and HF6. HL1 and HF5 both had in common C133 as the most mutable hotspot, with C108 also being a major hotspot in both cases. HL1 and HF6, on the other hand, shared four significant hotspots that amounted to 46.5 and 46.9% of the mutations, respectively, for each spectrum. Two of the three CV-1 cell line spectra, MK2 and MK3, had a similar hotspot pattern.
All eight MK3 hotspots were also hotspots for MK2 (which contained nearly four times as many mutations) including three, G103, G104 and G111, that were unique to that cell line, the latter being the most prominent hotspot for these two spectra but rarely mutated in others. The two transgenic mouse spectra MsS and MsL also differed considerably from most other spectra and showed some interesting patterns unique to these spectra. The CpG nucleotides at C114 and G115 were both significant hotspots in each spectrum, whereas they were only mutated at a low level, if at all, in all the other spectra. The most common hotspot, C133, found in 13 spectra and a significant hotspot in eight spectra, had no mutations present in the transgenic mouse MsS and MsL spectra. The likelihood of a site erroneously being considered a true mutational hotspot is reduced by considering the number of significant hotspots at the a posteriori probability level. Table IV also shows a considerable reduction in the number of sites significant within each cell line at the a posteriori level. After this reduction only three hotspots, G123, C155 and G156, occur over all species, albeit not in all the cell lines/tissues of all species. Other groups of hotspots, C108 (also in MsF), C109, G129, C133 and C139, are apparent for certain human cell lines but not for the monkey cell lines. One significant hotspot (G159) was shared between monkey and mouse.

Comparisons of mutation types

A more consistent pattern emerged for the types of substitution occurring within each spectrum (Table II). G:C→A:T transitions were the most frequent mutation type in 70% (14/20) of spectra. Nine of these spectra (HK3, HL1, HF1, HF3, MK2, MK3, MK5, MsF and MsS) had a significantly higher (P < 0.05) level of G:C→A:T substitutions relative to other types of change.
Conversely, G:C→T:A was predominant in HK3, HF2, HF6, MK5 and MsL (where HK3 and MK5 had significantly higher G:C→T:A relative to other substitution types), whereas G:C→C:G was most frequent in HK2. MK1 had the same percentage (23.1%) of G:C→A:T, G:C→T:A and A:T→T:A substitutions, but this spectrum comprised only 13 mutations, suggesting that these frequencies may have occurred by chance. The majority of G:C→T:A transversions were distributed over GG/CC, TC/GA and CG dinucleotides and were not especially common to any particular hotspots. The frequencies of transitions and transversions for all spectra were compared pair-wise using the Adams–Skopek test (Table V). Predictably, those spectra in which G:C→T:A and G:C→C:G mutations predominated were generally the most different from the remaining spectra. However, for those spectra where G:C→A:T transitions were most common an underlying pattern was apparent. For pair-wise comparisons between human cell line spectra only 7.1% (2/28) were significantly different. However, human and monkey pair-wise comparisons had a much higher number of differences, 43.8% (14/32), whereas 66.7% (4/6) of all within-monkey comparisons were significantly different. Also, 71.4% (5/7) of monkey and mouse comparisons and 33.3% (8/24) of human and mouse comparisons were significantly different. In mice there were no significant differences between any of the pair-wise comparisons. Importantly, these findings show that the levels of all kinds of substitutions present in human cell line spectra are homogeneous when G:C→A:T substitution predominates. The same is true for transgenic mouse spectra, but this is not the case for monkey spectra, which showed greater variation. Figure 1A reveals that GG/CC and GA/CT are the most frequently mutated dinucleotides, closely followed by CpG sites. Mutations at CpG sites make up ~25% of the total number. However, CpG dinucleotides collectively make up 43% of the total sequence.
Therefore, nearest neighbour analysis (Cooper and Krawczak, 1990) was performed on pooled and individual spectra to identify any underlying pattern as to the sequences where spontaneous mutations occur in supF. This approach takes into account the frequencies of all 16 possible dinucleotides within the sequence as well as the numbers of substitutions that occur and allows one to build rank orders of the most mutable compared with the least mutable dinucleotides for both individual and pooled spectra. The pooled and individual dinucleotide rank orders (Table VI) reveal variation between spectra as to where substitutions commonly occur. The most mutable dinucleotides in human cell lines were GA/TC (50.0%), GG/CC (33.3%), CpG (8.3%) and CT/AG (8.3%), in monkey cell lines they were CpG (60.0%), TA/AT (20.0%) and GT/AC (20.0%, due to the unusual G160 hotspot in MK5) and in mouse were CpG (66.7%) and TG/CA (33.3%). These dinucleotides (with the exception of TA) tend to always be found high up the rankings in most individual cases and explain the pattern observed in the pooled rank order. The least mutable dinucleotides over all spectra involved adjacent adenines and thymines, particularly AT, TA and AA/TT. The mutability of cytosine and guanine in CpG dinucleotides differed within the spectra. The level of cytosine and guanine mutations at all six CpG sites ranged between 3 and 39% over all spectra, with the majority of these mutations falling within the three TCpGA sites. The cytosines and guanines in GA/TC sites tended to be more spread over the sequence, probably reflecting the higher frequency and distribution of this dinucleotide pairing. The level of cytosine and guanine mutations in all three TCpGA sequences, given the general high mutability of GA/TC and CpG sites, suggests that these dinucleotides act synergistically, increasing the probability of mutation.
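The idea of ranking dinucleotides by mutations per occurrence, rather than by raw mutation counts, can be sketched as follows. This is a simplified illustration and not the Cooper and Krawczak procedure itself: the function names are hypothetical, each dinucleotide is grouped with its reverse complement (for example GA with TC), and a mutation at a given base is credited to both dinucleotides that contain it.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def canonical(dinuc):
    """Group a dinucleotide with its reverse complement (e.g. GA with TC)."""
    rev_comp = COMPLEMENT[dinuc[1]] + COMPLEMENT[dinuc[0]]
    return min(dinuc, rev_comp)


def dinucleotide_mutability(seq, mutations_per_site):
    """Rank dinucleotide classes by mutations per occurrence.

    seq: sequence string read 5'->3'.
    mutations_per_site: one substitution count per base of seq.
    """
    occurrences = Counter()
    mutations = Counter()
    for i in range(len(seq) - 1):
        key = canonical(seq[i:i + 2])
        occurrences[key] += 1
        # Credit the substitutions at both bases of this dinucleotide.
        mutations[key] += mutations_per_site[i] + mutations_per_site[i + 1]
    rates = {k: mutations[k] / occurrences[k] for k in occurrences}
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)
```

For a toy TCGA sequence with made-up counts of 5 and 3 substitutions at the central C and G, the CpG class ranks above GA/TC because it occurs once whereas GA/TC occurs twice. Normalising by occurrence in this way is what allows a rank order to be compared between spectra of different sizes.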
It is also noteworthy that the third most mutable CpG site (C110–G111), which is also highly mutable in MK2 and MK3, lies within a CGA/TCG sequence. GG/CC substitutions were generally observed in guanine tracts (G102–G105 and G122–G124) and a cytosine tract (C108–C110). Surprisingly, very few GG/CC mutations were found in the GG/CC-rich region between nt 168 and 182.

Strand bias

Analysis of the individual spectra in Figure 2 reveals different levels of mutations at cytosines and guanines along the NT strand. For each individual spectrum cytosine and guanine frequencies were compared for strand bias (Table VII). Eight spectra show no strand bias, i.e. similar numbers of mutable cytosines and guanines to that expected. Four of these (HK2, HK3, MK1 and MK4) have fewer than 11 mutable cytosines and guanines, suggesting that the non-significant result could be due to chance. The other four spectra, HF1 and the three XP derived spectra HF3, HF4 and HF5, all have a higher number of mutated cytosines and guanines yet show no strand bias, indicating that within these cell lines there could be no preferential repair of either strand. The remaining 12 spectra all show a significant strand bias, of which 10 show a significant excess of substitutions at guanines. The other two spectra, HL1 and HF6, have an excess of substitutions at cytosines in the NT strand and show an exceptionally high frequency of substitutions at TC dinucleotides (see Table VI).

Discussion

The last 15 years have produced a multitude of publications describing mutations that have arisen in the supF tRNA sequence after treatment with many different mutagens and subsequent replication in mammalian cells. Many of these publications also describe mutations, mainly base substitutions, assumed to have arisen spontaneously in control experiments.
The extensive collection of these mutations in the MGMD (Lewis et al., 2000) has allowed, for the first time, a survey of the distribution and types of spontaneous base substitutions found in supF when cultured in mammalian cell lines. One of the principal reasons for carrying out this present study was to identify potential differences in the patterns of spontaneous mutations in cell lines derived from different tissues and species. Identification of tissue-specific differences for spontaneous mutations would be of critical value when assessing the pattern of chemically induced mutations in different cell types. It would also aid in deciding whether pooling of data derived from different sources is legitimate. Previous studies have shown tissue-specific differences for the distribution and types of spontaneous mutation occurring in transgenic mice. Harbach et al. (1999) found no differences between spontaneous spectra and hotspots for the cII locus in liver, lung and spleen, but G:C→A:T transitions were more abundant in liver and lung, whereas spleen showed a higher level of G:C→T:A. Ono et al. (1999) found G:C→A:T to be the most frequent substitution in the lacZ gene recovered from mouse liver, spleen and brain. The same results were obtained by de Boer et al. (1996) for the lacI gene in kidney, stomach and liver. A more extensive study by de Boer et al. (1998) of spontaneous substitutions of lacI in transgenic mice revealed G:C→A:T to be most frequent in lung, spleen, bladder, skin, stomach, kidney and liver. However, they also found that G:C→A:T occurred at equal frequency to G:C→T:A in bone marrow. Douglas et al. (1994) also showed G:C→A:T to be the most abundant substitution type in the lacZ gene in mouse liver. Taking all data from this present study into account, however, there were no apparent tissue-specific patterns or differences for spontaneous supF base substitutions in cultured mammalian cells.
The overall conclusion to be drawn from this set of mutation data is that there is considerable variation between cell lines in the actual spontaneous mutation distribution that can arise within the 85 bp supF tRNA sequence. To some extent this was enhanced by the variation in numbers of mutations contributing to each spectrum. When comparing the mutation distribution alone there was no homogeneity within cell lines, with many different hotspots apparent. The transgenic mouse liver and skin distributions were both highly different to the cell line spectra (Table III), suggesting that the mechanisms of spontaneous mutagenesis, including cause and/or processing of certain adducts, could be different to that found in cultured mammalian cells, at least for base substitutions. However, mutable regions did stand out over nearly all spectra, particularly those involving GG/CC and TCpGA, showing the higher spontaneous mutability of these sequences. Although there was much variation in the different supF mutation distributions, there were evident patterns in the types of substitutions occurring in the cell lines and the sequences being targeted. G:C→A:T transitions predominated in over half the individual spectra, followed by G:C→T:A transversions, which were also the most frequent in the remaining spectra (except HK2). Akasaka et al. (1992) also found G:C→T:A to be the most frequent spontaneous substitution in E.coli transformed with supF. Also, a pattern existed as to the sequences where spontaneous mutations arose in cell lines originating from different species, with GA/TC and GG/CC being more mutable in human than monkey and mouse cells, where CpG sites were most mutable. There was also a pattern in strand bias for guanine mutations in the NT strand of 12 of the spectra, but this was not tissue specific. 
However, the fact that two spectra showed a NT strand bias for cytosine mutations and eight spectra showed no strand bias for mutations at either nucleotide suggests that differences can exist both within and between cell lines as to the causes, processing and repair of adducts that result in spontaneous mutations. Human tRNA genes are transcribed by RNA polymerase III and previous research shows that these genes are not associated with transcription-coupled repair (TCR), the primary source of a strand bias for mutations (Dammann and Pfeifer, 1997). In addition, the supF gene is not transcribed in mammalian cells, therefore, a surplus of mutations at guanines or cytosines in one strand observed in the supF data could result from different types or levels of endogenous factors. Endogenous factors that ultimately cause spontaneous mutations can be many and combinations of these factors in different cell lines in different laboratories could also lead to the variation observed for the supF spectra. Endogenous factors with direct or indirect DNA-damaging potential include oxidants, lipid peroxidation products, alkylating agents, oestrogens, chlorinating agents, reactive nitrogen species and certain intermediates of various metabolic pathways (Burcham, 1999). Added to this list could be chemical or physical mutagens to which the supF plasmid could be exposed within a laboratory. Oxidants probably have the greatest potential for causing spontaneous mutations (reviewed by Burcham, 1999) and could also cause variation in where and what types of base substitutions occur along the mutation spectrum. For example, products of cytosine oxidation are likely precursors of G:C→A:T transitions, which are abundant in the supF spectra described here. This would also explain the high number of spontaneous mutations at cytosines observed when single-stranded supF was transfected into a monkey cell line (Cabral Neto et al., 1992, 1993; Madzak et al., 1992; Fronza et al., 1994). 
Oxidation of guanine can result in G:C→T:A transversions, which were also abundant in the supF spectra. DNA damage by oxidants could be further enhanced by a compromised level of antioxidants within a cell line. Spontaneous deamination of cytosine to uracil would explain the existence of hotspots at CpG sites. It is also interesting that deamination of cytosine to uracil is 250-fold more frequent in single-stranded than double-stranded DNA (Madzak et al., 1992). The nature of the sequence of the supF tRNA gene suggests that there is great potential for hairpin loops to form, leaving certain nucleotides in single-stranded regions where deamination of cytosine would lead to G:C→A:T transitions. In addition, the efficiency of trans-lesion synthesis and 3′→5′ exonucleolytic proofreading of endogenous lesions depends upon the specific DNA polymerase that processes groups of lesions (Eckert and Opresko, 1999). Thus, spontaneous mutation spectra will ultimately vary given the extent and interplay of the different factors that cause spontaneous mutations. Smith (1992), in his review of spontaneous mutagenesis, stated that 'it is not widely appreciated that the types and frequencies of spontaneous mutations change markedly with subtle changes in experimental conditions'. There is always a small possibility that mutations could arise spontaneously in supF in the bacterial host used for selecting mutants. Nonetheless, Hauser et al. (1986) showed, through a series of duplicate experiments involving transformation of E.coli MBM7070 with pZ189 vector after recovery from mammalian cells, that the mutations all arose in mammalian cells and not the bacteria. However, we observed that the spectrum for MK5 (monkey COS7 cells), which was highly significantly different from all other spectra in this study, was similar to the spontaneous spectrum for supF after transformation of E.coli KS40 cells (Akasaka et al., 1992) without prior transfection of mammalian cells.
This was particularly the case for the hotspot at G160. In fact, comparison of the two spectra by the Adams–Skopek test revealed a P value of 0.03, which although significant was a much higher value than those observed in all other comparisons involving MK5. It is interesting that both studies utilized the KS40 strain (a derivative of MBM7070) and might suggest that the mutations at G160 of MK5 arose within the bacteria. This supports the argument against pooling spontaneous mutation data. Whether or not it is legitimate to pool any of the spontaneous mutation data is still debatable. Pooling of within-tissue or within-species data would result in spectra biased by the individual spectrum contributing most of the mutations. The results of spectra comparisons in this study using the Adams–Skopek test argue against pooling spectra. A useful additional approach to analysing mutation spectra would be a system where spectra comparisons result in a similarity score. For example, two identical spectra would show a score of, say, 1, whereas two unrelated spectra would have a score near to 0. This would allow groups of spectra to be analysed in a multi-dimensional manner revealing the potential inter-relationships between spectra within and between tissues and species. An alternative approach recently described involves collapsing similar profiles into homogeneous clusters to reveal their similarity patterns (Khromov-Borisov et al., 1999). The spontaneous spectra described here can be used as a major reference source in a number of ways. Primarily, the individual spectra should be used for comparison of supF spectra generated after treatment with chemical and physical agents. This will assist in distinguishing mutagen-prone hotspots from potential spontaneous background mutations. 
A clear example involves the comparison of the spontaneous spectra shown here and the supF mutation spectra generated after treatment with the four stereoisomers of benzo[c]phenanthrene 3,4-dihydrodiol 1,2-epoxide (B[c]PhDE) (Bigger et al., 2000). Analysis of the four chemically induced mutation spectra shows C133 to be a major hotspot, yet this is also the most frequent spontaneously mutated site. Further analysis, however, reveals that after B[c]PhDE treatment the predominant type of substitution at C133 is G:C→T:A, whereas G:C→C:G is the most frequent spontaneous mutation at this site. Conversely, A112 is a major hotspot for three of the stereoisomers, yet no spontaneous mutation has been observed at this site, showing the specificity of the stereoisomers for this particular sequence. Mutagen-induced supF spectra can thus now readily be compared with a panel of spontaneous supF mutation spectra for hotspot assessment and deduction of the underlying potential background. We recommend that mutagen-induced mutation spectra should be compared with spontaneous (background) spectra of the same cell line but also that consideration is given to all of the spontaneous mutation spectra presented here, given the variability in distribution of supF spontaneous mutations within and between the different cell lines and the lack of any clear cell line/tissue/species mutation patterns. In addition to comparisons with chemically induced mutation spectra, the extent of the spontaneous mutation data described here will also be a useful resource for determination of the various causes and sequence specificities of spontaneous mutations, especially in mammalian cell lines. In particular, comparisons of the spontaneous mutation spectra with those arising after treatment with oxidizing agents would give insight into the contribution of oxidative damage to the overall spontaneous mutation spectrum.
Finally, the spontaneous supF mutation data may be compared with similar data from other genes widely used in in vitro mutagenesis studies, such as hprt, obtainable from the MGMD (Lewis et al., 2000), as well as mutations known to cause inherited human disease (Krawczak and Cooper, 1997).

Table I. supF spontaneous mutant data derived from 20 sources available in the MGMD (Lewis et al., 2000)

| Cell line | Code | Species | Origin | Plasmid | No. of base substitutions | Mutation frequency (×10⁻⁴) | Reference |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ad293 | HK1 | H | kid | pS189 | 5 | 0.5 | Bigger et al. (1990) |
| Ad293 | HK2 | H | kid | pS189 | 9 | 0.2 | Bigger et al. (1992) |
| Ad293 | HK3 | H | kid | pS189 | 13 | 0.8 | Boldt et al. (1991) |
| Ad293 | HK4 | H | kid | pSP189 | 50 | 1.1 | Jeudes and Wogan (1996) |
| GM606 | HL1 | H | lym | pZ189 | 108 | 1.1 | Jaberaboansari et al. (1991) |
| GM606 | HL2 | H | lym | pZ189 | 44 | 1.0 | Sikpi et al. (1991) |
| GM0637 | HF1 | H | fib | pSP189 | 31 | 0.1–0.3%ᵃ | Myrand et al. (1996) |
| GM0637B | HF2 | H | fib | pZ189 | 37 | 0.07%ᵃ | Seidman et al. (1987) |
| XP12BE-SV | HF3 | H | fib | pSP189 | 30 | 0.1–0.3%ᵃ | Myrand et al. (1996) |
| XAN1 | HF4 | H | fib | pSP189 | 22 | 0.1–0.3%ᵃ | Myrand et al. (1996) |
| 2-0-A2 | HF5 | H | fib | pSP189 | 19 | 0.1–0.3%ᵃ | Myrand et al. (1996) |
| W138-VA13 | HF6 | H | fib | pMY189 | 32 | N/A | Kawanishi et al. (1998) |
| CV-1 | MK1 | M | kid | pZ189 | 13 | 11.1 | Akman et al. (1991) |
| CV-1 | MK2 | M | kid | pZ189 | 211 | 0.03%ᵃ | Moraes et al. (1989) |
| CV-1 (TC-7) | MK3 | M | kid | pZ189 | 56 | 3.7 | Hauser et al. (1987) |
| COS-7 | MK4 | M | kid | p3AC | 9 | 1.4 | Yang et al. (1987) |
| COS-7 | MK5 | M | kid | pMY189 | 31 | 8.2 | Murata-Kamiya et al. (1997) |
| LN12 | MsF | Ms | fib | λsupF | 32 | 3.2 | Leach et al. (1996) |
| Transgenic | MsS | Ms | skin | λsupF | 42 | 0.2 | Leach et al. (1996) |
| Transgenic | MsL | Ms | liver | λsupF | 46 | 0.2 | Leach et al. (1996) |

Data were first sorted by cell line and then separated into within cell line subgroups. For clarity and ease of analysis each cell line subgroup was coded, where the first letter describes the species and the second letter describes the tissue of origin. All cell lines are assumed to be nucleotide excision repair (NER) proficient with the exception of XP12BE-SV (HF3) and 2-0-A2 (HF5), expressing the XP-A phenotype, rendering them extremely sensitive to UV and chemical DNA-damaging agents. H, human; M, monkey; Ms, mouse; kid, kidney; lym, lymphoblast; fib, fibroblast.
ᵃ Mutation frequency recorded as a percentage by authors.

Table II. Types of spontaneous base substitutions in supF observed for each cell line and subgroup

GC→AT and AT→GC are transitions; GC→CG, GC→TA, AT→CG and AT→TA are transversions.

| Code | GC→AT | AT→GC | GC→CG | GC→TA | AT→CG | AT→TA |
| --- | --- | --- | --- | --- | --- | --- |
| HK1 | 80.0 (4) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 20.0 (1) |
| HK2 | 0 (0) | 0 (0) | 44.4 (4) | 33.3 (3) | 0 (0) | 22.2 (2) |
| HK3 | 7.7 (1) | 0 (0) | 7.7 (1) | 69.2 (9) | 15.4 (2) | 0 (0) |
| HK4 | 46.0 (23) | 0 (0) | 14.0 (7) | 34.0 (17) | 2.0 (1) | 4.0 (2) |
| HL1 | 61.1 (66) | 0 (0) | 14.8 (16) | 23.1 (25) | 0.9 (1) | 0 (0) |
| HL2 | 50.0 (22) | 0 (0) | 15.9 (7) | 31.8 (14) | 0 (0) | 2.3 (1) |
| HF1 | 67.7 (21) | 0 (0) | 6.5 (2) | 19.4 (6) | 3.2 (1) | 3.2 (1) |
| HF2 | 27.0 (10) | 2.7 (1) | 24.3 (9) | 43.2 (16) | 0 (0) | 2.7 (1) |
| HF3 | 56.7 (17) | 0 (0) | 10.0 (3) | 23.3 (7) | 10.0 (3) | 0 (0) |
| HF4 | 50.0 (11) | 0 (0) | 13.6 (3) | 36.4 (8) | 0 (0) | 0 (0) |
| HF5 | 47.4 (9) | 0 (0) | 31.6 (6) | 15.8 (3) | 5.3 (1) | 0 (0) |
| HF6 | 28.1 (9) | 3.1 (1) | 21.9 (7) | 46.9 (15) | 0 (0) | 0 (0) |
| MK1 | 23.1 (3) | 0 (0) | 15.4 (2) | 23.1 (3) | 15.4 (2) | 23.1 (3) |
| MK2 | 48.8 (103) | 0.5 (1) | 19.0 (40) | 30.3 (64) | 0.5 (1) | 0.9 (2) |
| MK3 | 64.3 (36) | 0 (0) | 3.6 (2) | 30.4 (17) | 0 (0) | 1.8 (1) |
| MK4 | 44.4 (4) | 11.1 (1) | 11.1 (1) | 11.1 (1) | 22.2 (2) | 0 (0) |
| MK5 | 3.2 (1) | 0 (0) | 32.3 (10) | 64.5 (20) | 0 (0) | 0 (0) |
| MsF | 65.6 (21) | 6.3 (2) | 3.1 (1) | 21.9 (7) | 3.1 (1) | 0 (0) |
| MsS | 66.7 (28) | 4.8 (2) | 4.8 (2) | 14.3 (6) | 7.1 (3) | 2.4 (1) |
| MsL | 37.0 (17) | 6.5 (3) | 2.2 (1) | 39.1 (18) | 10.9 (5) | 4.3 (2) |
| Total | 47.9 (401) | 1.3 (11) | 14.9 (125) | 31.2 (261) | 2.6 (22) | 2.0 (17) |

The initial figures are the percentage of that type of substitution observed for that cell line/subgroup, whereas the following figure in parentheses is the actual number of substitutions observed from that experiment. Figures in bold show the most frequent type of base substitution observed.

Table III.
The probabilities of significant differences for each pair-wise comparison of all spontaneous mutation spectra after application of the Adams–Skopek test for comparing the mutation distribution in two mutation spectra Each value is the probability that the two spectra are different. Figures in bold highlight those spectra that were significantly different at the 5% significance level. A probability value of 0.000 indicatesP < 0.001. The bottom row shows the total number of significant differences for each spectrum.    HK1  HK2  HK3  HK4  HL1  HL2  HF1  HF2  HF3  HF4  HF5  HF6  MK1  MK2  MK3  MK4  MK5  MsF  MsS  MsL  HK1    1.000  0.548  0.156  0.001  0.308  0.038  0.001  0.352  0.481  0.296  0.001  0.771  0.151  0.551  0.384  0.001  0.477  0.089  0.071  HK2  1.000    0.981  0.271  0.024  0.767  0.932  0.875  0.568  0.278  0.898  0.111  0.000  0.114  0.702  1.000  0.002  0.742  0.075  0.144  HK3  0.548  0.981    0.135  0.024  0.301  0.376  0.032  0.578  0.402  0.472  0.023  0.011  0.012  0.225  0.766  0.006  0.370  0.096  0.082  HK4  0.156  0.271  0.135    0.000  0.036  0.041  0.061  0.010  0.356  0.347  0.004  0.000  0.000  0.003  0.000  0.000  0.069  0.000  0.000  HL1  0.001  0.024  0.024  0.000    0.000  0.000  0.000  0.000  0.040  0.070  0.388  0.000  0.000  0.000  0.019  0.000  0.020  0.000  0.000  HL2  0.308  0.767  0.301  0.036  0.000    0.099  0.233  0.107  0.293  0.322  0.012  0.031  0.170  0.664  0.208  0.000  0.747  0.002  0.037  HF1  0.038  0.932  0.376  0.041  0.000  0.099    0.013  0.000  0.002  0.541  0.059  0.047  0.000  0.241  0.110  0.000  0.077  0.002  0.026  HF2  0.001  0.875  0.032  0.061  0.000  0.233  0.013    0.772  0.644  0.622  0.001  0.001  0.003  0.221  0.142  0.000  0.006  0.000  0.001  HF3  0.352  0.568  0.578  0.010  0.000  0.107  0.000  0.772    0.548  0.621  0.067  0.016  0.000  0.757  0.316  0.000  0.273  0.000  0.005  HF4  0.481  0.278  0.402  0.356  0.040  0.293  0.002  0.644  0.548    0.634  0.239  0.206  0.004  0.169  0.782  
0.000  0.099  0.000  0.004  HF5  0.296  0.898  0.472  0.347  0.070  0.322  0.541  0.622  0.621  0.634    0.240  0.022  0.248  0.005  0.634  0.000  0.049  0.000  0.014  HF6  0.001  0.111  0.023  0.004  0.388  0.012  0.059  0.001  0.067  0.239  0.240    0.001  0.002  0.001  0.151  0.000  0.023  0.000  0.000  MK1  0.771  0.000  0.011  0.000  0.000  0.031  0.047  0.001  0.016  0.206  0.022  0.001    0.000  0.008  0.005  0.000  0.083  0.005  0.010  MK2  0.151  0.114  0.012  0.000  0.000  0.170  0.000  0.003  0.000  0.004  0.248  0.002  0.000    0.601  0.082  0.000  0.019  0.000  0.000  MK3  0.551  0.702  0.225  0.003  0.000  0.664  0.241  0.221  0.757  0.169  0.005  0.001  0.008  0.601    0.147  0.000  0.732  0.022  0.042  MK4  0.384  1.000  0.766  0.000  0.019  0.208  0.110  0.142  0.316  0.782  0.634  0.151  0.005  0.082  0.147    0.008  0.487  0.014  0.189  MK5  0.001  0.002  0.006  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.008    0.000  0.000  0.000  MsF  0.477  0.742  0.370  0.069  0.020  0.747  0.077  0.006  0.273  0.099  0.049  0.023  0.083  0.019  0.732  0.487  0.000    0.007  0.287  MsS  0.089  0.075  0.096  0.000  0.000  0.002  0.002  0.000  0.000  0.000  0.000  0.000  0.005  0.000  0.022  0.014  0.000  0.007    0.938  MsL  0.071  0.144  0.082  0.000  0.000  0.037  0.026  0.001  0.005  0.004  0.014  0.000  0.010  0.000  0.042  0.189  0.000  0.287  0.938      5  3  6  12  17  7  11  11  8  6  6  12  16  13  8  5  19  7  15  13  Each value is the probability that the two spectra are different. Figures in bold highlight those spectra that were significantly different at the 5% significance level. A probability value of 0.000 indicatesP < 0.001. The bottom row shows the total number of significant differences for each spectrum.    
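The bottom row of Table III can be reproduced from the matrix itself. The following is a minimal Python sketch, assuming that a comparison counts as significant when its tabulated value is below 0.05; the function name and the `alpha` default are illustrative and not taken from the original analysis. The two lists are the HK1 and HK2 rows of Table III with the diagonal omitted.

```python
def count_significant(p_values, alpha=0.05):
    """Count the pair-wise comparisons whose P value is below alpha."""
    return sum(1 for p in p_values if p < alpha)


# HK1 and HK2 rows of Table III (diagonal omitted); 0.000 stands for P < 0.001.
hk1 = [1.000, 0.548, 0.156, 0.001, 0.308, 0.038, 0.001, 0.352, 0.481, 0.296,
       0.001, 0.771, 0.151, 0.551, 0.384, 0.001, 0.477, 0.089, 0.071]
hk2 = [1.000, 0.981, 0.271, 0.024, 0.767, 0.932, 0.875, 0.568, 0.278, 0.898,
       0.111, 0.000, 0.114, 0.702, 1.000, 0.002, 0.742, 0.075, 0.144]

print(count_significant(hk1))  # 5, matching the HK1 entry of the bottom row
print(count_significant(hk2))  # 3, matching the HK2 entry of the bottom row
```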
Table IV. Significant hotspots (at the 5% significance level), after application of the χ2 heterogeneity test for each cell line subgroup (HK1–MsL) and pooled data (Total). The first column shows the nucleotides where a significant hotspot was observed in one or more spectra. Hotspots significant at the a priori probability level are represented by gray and solid blocks. Hotspots significant at the a posteriori probability level are represented by solid blocks.

Table V. The probabilities of significant differences for each pair-wise comparison for cell line subgroups for the types of spontaneous substitutions after application of the Adams–Skopek test.

       HK1    HK2    HK3    HK4    HL1    HL2    HF1    HF2    HF3    HF4    HF5    HF6    MK1    MK2    MK3    MK4    MK5    MsF    MsS    MsL
HK1      -    0.009  0.002  0.222  0.036  0.083  0.326  0.075  0.139  0.064  0.281  0.023  0.259  0.061  0.063  0.715  0.000  0.278  0.433  0.211
HK2    0.009    -    0.029  0.022  0.000  0.004  0.006  0.061  0.001  0.000  0.079  0.021  0.312  0.012  0.000  0.023  0.044  0.000  0.000  0.001
HK3    0.002  0.029    -    0.021  0.001  0.004  0.002  0.049  0.012  0.007  0.000  0.042  0.134  0.004  0.001  0.042  0.061  0.007  0.001  0.182
HK4    0.222  0.022  0.021    -    0.095  0.981  0.634  0.293  0.287  0.921  0.371  0.193  0.029  0.434  0.116  0.029  0.000  0.166  0.023  0.044
HL1    0.036  0.000  0.001  0.095    -    0.387  0.343  0.000  0.074  0.934  0.084  0.001  0.000  0.000  0.068  0.002  0.000  0.021  0.005  0.000
HL2    0.083  0.004  0.004  0.981  0.387    -    0.569  0.144  0.169  0.959  0.344  0.128  0.003  0.961  0.179  0.012  0.000  0.108  0.023  0.008
HF1    0.326  0.006  0.002  0.634  0.343  0.569    -    0.006  0.759  0.538  0.484  0.012  0.056  0.269  0.260  0.135  0.000  0.481  0.642  0.021
HF2    0.075  0.061  0.049  0.293  0.000  0.144  0.006    -    0.008  0.405  0.147  1.000  0.016  0.112  0.000  0.029  0.028  0.011  0.000  0.012
HF3    0.139  0.001  0.012  0.287  0.074  0.169  0.759  0.008    -    0.403  0.116  0.015  0.048  0.019  0.071  0.354  0.000  0.499  0.587  0.108
HF4    0.064  0.000  0.007  0.921  0.934  0.959  0.538  0.405  0.403    -    0.211  0.355  0.025  0.851  0.301  0.059  0.000  0.316  0.103  0.109
HF5    0.281  0.079  0.000  0.371  0.084  0.344  0.484  0.147  0.116  0.211    -    0.098  0.135  0.268  0.005  0.116  0.001  0.045  0.052  0.002
HF6    0.023  0.021  0.042  0.193  0.001  0.128  0.012  1.000  0.015  0.355  0.098    -    0.006  0.191  0.001  0.022  0.018  0.017  0.001  0.014
MK1    0.259  0.312  0.134  0.029  0.000  0.003  0.056  0.016  0.048  0.025  0.135  0.006    -    0.001  0.001  0.535  0.000  0.124  0.026  0.078
MK2    0.061  0.012  0.004  0.434  0.000  0.961  0.269  0.112  0.019  0.851  0.268  0.191  0.001    -    0.068  0.003  0.000  0.032  0.001  0.000
MK3    0.063  0.000  0.001  0.116  0.068  0.179  0.260  0.000  0.071  0.301  0.005  0.001  0.001  0.068    -    0.001  0.000  0.198  0.064  0.004
MK4    0.715  0.023  0.042  0.029  0.002  0.012  0.135  0.029  0.354  0.059  0.116  0.022  0.535  0.003  0.001    -    0.000  0.351  0.564  0.430
MK5    0.000  0.044  0.061  0.000  0.000  0.000  0.000  0.028  0.000  0.000  0.001  0.018  0.000  0.000  0.000  0.000    -    0.000  0.000  0.000
MsF    0.278  0.000  0.007  0.166  0.021  0.108  0.481  0.011  0.499  0.316  0.045  0.017  0.124  0.032  0.198  0.351  0.000    -    0.818  0.442
MsS    0.433  0.000  0.001  0.023  0.005  0.023  0.642  0.000  0.587  0.103  0.052  0.001  0.026  0.001  0.064  0.564  0.000  0.818    -    0.056
MsL    0.211  0.001  0.182  0.044  0.000  0.008  0.021  0.012  0.108  0.109  0.002  0.014  0.078  0.000  0.004  0.430  0.000  0.442  0.056    -
Total      5     16     16      7     12      7      6     11      7      4      5     13     11     10      9     10     18      8     10     11

Each value is the probability that the two spectra are different. Figures in bold highlight those spectra that were significantly different at the 5% significance level. A probability value of 0.000 indicates P < 0.001. The bottom row shows the total number of significant differences for each spectrum.
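To make the kind of comparison summarised in Table V more concrete, the sketch below estimates a P value for two sets of substitution-type counts by Monte Carlo simulation. It assumes that the Adams–Skopek test can be approximated as a Monte Carlo estimate of the hypergeometric test for a 2 × C table with fixed margins. The function names, the iteration count and the seed are illustrative choices and this is not the program used in the original analysis, so the printed value should not be expected to reproduce the tabulated figures.

```python
import math
import random


def log_table_weight(row_a, row_b):
    """Log of 1 / prod(n_ij!), which is proportional to the hypergeometric
    probability of a table when both sets of margins are held fixed."""
    return -sum(math.lgamma(n + 1) for n in list(row_a) + list(row_b))


def monte_carlo_spectrum_test(row_a, row_b, n_iter=2000, seed=0):
    """Fraction of randomly relabelled tables that are no more probable
    than the observed one (illustrative Monte Carlo P value)."""
    rng = random.Random(seed)
    n_classes = len(row_a)
    # One label per observed mutation, pooled over both spectra.
    pooled = [k for k in range(n_classes) for _ in range(row_a[k] + row_b[k])]
    size_a = sum(row_a)
    observed = log_table_weight(row_a, row_b)
    extreme = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        sim_a = [0] * n_classes
        for k in pooled[:size_a]:
            sim_a[k] += 1
        sim_b = [row_a[k] + row_b[k] - sim_a[k] for k in range(n_classes)]
        # Small tolerance guards against floating-point ties.
        if log_table_weight(sim_a, sim_b) <= observed + 1e-9:
            extreme += 1
    return extreme / n_iter


# Substitution counts for HK4 and HL2 from Table II
# (GC→AT, AT→GC, GC→CG, GC→TA, AT→CG, AT→TA).
hk4 = [23, 0, 7, 17, 1, 2]
hl2 = [22, 0, 7, 14, 0, 1]
print(monte_carlo_spectrum_test(hk4, hl2))
```

Ranking simulated tables by their hypergeometric probability, rather than by a chi-square statistic, keeps the sketch usable for sparse spectra in which many substitution classes have zero or very small counts.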
Table VI. Rank orders of dinucleotides arranged, left to right, from most mutable to least mutable after application of nearest neighbour analysis (Cooper and Krawczak, 1990).

Subgroup  Most mutable dinucleotide → least mutable dinucleotide(s)
HK1       GG > GA > TT > AG > CG > TC > AA/AC/AT/GC/GT/CA/CC/CT/TA/TG
HK2       GA > AG > GG > AA/CG > TC > AT > CT > CC > AC/GC/GT/CA/TA/TG/TT
HK3       GG > TG > GC > CT > AA > CA > CG > TC > CC > AG > GA > AC/AT/GT/TA/TT
HK4       GA > AG > GC > CG > GG > TG > TC > CT > CC > AC/GT > TT > CA > AA/AT/TA
HL1       TC > CG > CT > CC > CA > GA > GG > AG > AA > GC > AC/AT/GT/TA/TG/TT
HL2       GG > TG > CG > GA > AG > TC > CC > CT > GC > AC > CA > AA/AT/GT/TA/TT
HF1       GA > AG/GG > CG > CT > TC > CC > AC/GC > CA > AA/AT/GT/TA/TG/TT
HF2       CG > TC > GA > CT > AG > GT > CC > CA > GG > AA > AT > AC/GC > TA/TG/TT
HF3       TC > CG > CC > CA > GA > GG > AC > CT > GC/GT > AG > AA/AT/TA/TG/TT
HF4       CC > TC > CG > GG > CT > GA > GC > AG > AA/AC/AT/GT/CA/TA/TG/TT
HF5       CT > TC > GG > CG/GA > AG > TT > CC > CA > AA/AC/AT/GC/GT/TA/TG
HF6       TC > CG > CT > CC > GA > AC/GG/TT > CA > AA/AG/AT/GC/GT/TA/TG
MK1       TA > GT > CG > AA > TT > GG > TC > AT > CC > AC > CT > GA > AG/GC/CA/TG
MK2       CG > GG > GA > CT > TC > AG > CC > TG > CA > GT > AA > GC/TT > AC > AT/TA
MK3       CG > GA > GG/TG > CC > GT/TC > AG/CT > CA > GC > TT > AA/AC/AT/TA
MK4       TT > CG > GG > TC > CT > GT > GA > AA/AG/AC/AT/GC/CA/CC/TA/TG
MK5       GT > GG > CT > TC > CC > GA/CG > GC > AG > AA/AC/AT/CA/TA/TG/TT
MsF       TG > CG > GG > AG > TC > GA > GC > CC > AT > GT/TT > CT > AA > AC/CA/TA
MsS       CG > GG > GC > TG > GT > GA > AG/TA > TT > CA > AA > AC > TC > CT > AT/CC
MsL       CG > TG > GA > GC > GG > GT > TT > TA/AG > TC > AA > CA > CC > CT > AT/AC
Total     CG > GG > GA > TC > TG > CT > AG > CC > GT > GC > CA > TT > TA > AA > AC > AT

For example, for subgroup HF4 the dinucleotide CC was most mutable relative to all other dinucleotides, followed by TC. Dinucleotides separated by / had the same dinucleotide relative mutability.
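The rank-order notation of Table VI (">" between levels of mutability, "/" between dinucleotides of equal relative mutability) is straightforward to parse when comparing where a given dinucleotide, such as CG, falls in different subgroups. The following Python sketch is illustrative only; the function names are hypothetical, and it reads the published rank strings rather than recomputing relative mutabilities.

```python
def parse_rank_order(rank_order):
    """Split a Table VI rank string into tiers, most mutable first.
    Dinucleotides joined by '/' share a tier."""
    tiers = [t for t in rank_order.replace(" ", "").split(">") if t]
    return [tier.split("/") for tier in tiers]


def tier_of(rank_order):
    """Map each dinucleotide to its 1-based tier in the rank order."""
    return {
        dinucleotide: position
        for position, tier in enumerate(parse_rank_order(rank_order), start=1)
        for dinucleotide in tier
    }


# Rank strings for subgroups HF4 and HK2, copied from Table VI.
hf4 = "CC > TC > CG > GG > CT > GA > GC > AG > AA/AC/AT/GT/CA/TA/TG/TT"
hk2 = "GA > AG > GG > AA/CG > TC > AT > CT > CC > AC/GC/GT/CA/TA/TG/TT"

print(tier_of(hf4)["CG"])  # 3: CG is third most mutable in HF4
print(tier_of(hk2)["CG"])  # 4: CG shares the fourth tier with AA in HK2
```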
Table VII. Strand bias for mutations at cytosines (C) and guanines (G) in the NT strand for each cell line/subgroup.

Subgroup  C observed  C expected  G observed  G expected  χ2      P
HK1       0           2           4           2           4.909   0.027
HK2       2           4           5           3           1.992   0.158
HK3       6           6           4           4           0.097   0.755
HK4       13          25          32          20          12.498  0.000
HL1       91          59          14          46          42.286  0.000a
HL2       17          24          26          19          4.212   0.040
HF1       17          16          12          13          0.145   0.703
HF2       12          19          23          16          6.130   0.013
HF3       17          15          10          12          0.674   0.412
HF4       15          12          7           10          1.521   0.217
HF5       11          10          7           8           0.263   0.608
HF6       27          17          4           14          12.827  0.000a
MK1       4           4           3           3           0.012   0.914
MK2       75          116         135         94          31.907  0.000
MK3       20          30          35          25          7.806   0.005
MK4       2           3           3           2           0.461   0.497
MK5       10          17          20          13          5.746   0.017
MsF       9           15          19          13          5.966   0.015
MsS       10          20          26          16          10.864  0.001
MsL       12          20          24          16          6.896   0.009

P values significant at the 5% significance level are shown in bold. A probability value of 0.000 indicates P < 0.001.
a: A significant P value where there was a higher level of substitutions observed at cytosines in the NT strand.
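The P values in Table VII are consistent with a χ2 statistic on one degree of freedom; for example, χ2 = 4.909 corresponds to P ≈ 0.027. The Python sketch below shows the arithmetic, assuming the usual goodness-of-fit form Σ(O − E)²/E; the function names are illustrative. Because the expected counts in the table appear to be rounded to whole numbers, a χ2 value recomputed from them is only approximate (12.96 for HK4, against the tabulated 12.498).

```python
import math


def chi_square(observed, expected):
    """Goodness-of-fit statistic: sum of (O - E)^2 / E over categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variate with 1 degree of
    freedom, using P(X > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# HK4 row of Table VII: substitutions at C and G in the NT strand.
hk4_observed = [13, 32]
hk4_expected = [25, 20]  # rounded values as printed in the table

print(round(chi_square(hk4_observed, hk4_expected), 2))  # 12.96
print(round(p_value_1df(4.909), 3))  # 0.027, as tabulated for HK1
print(round(p_value_1df(1.992), 3))  # 0.158, as tabulated for HK2
```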
Fig. 1. supF mutation spectra where the tRNA nucleotide sequence for the NT strand is represented on the x-axis (nt 99–183) and frequency of data (actual numbers) is shown on the y-axis. Each base substitution is colour coded: a nucleotide change to A is green, to G is red and to Y is blue. X can be A, C, G or T depending on the nucleotide position. (A) Pooled supF spontaneous mutation spectra. Each nucleotide position shows the number of base substitutions observed over all mutant data. The types of substitution are colour coded, for example, at C133 52 substitutions were observed: nine were C→A (green), 38 were C→G (red) and seven were C→T (blue). (B) Mutation spectrum for spontaneous substitutions after single-stranded supF was transfected into monkey kidney COS-7 cells. (C) Spectrum showing the number of individual mutation spectra contributing mutant data to each nucleotide site, for example, mutations at G123 were derived from 16 references, whereas no substitutions were recorded from any reference at A147.

Fig. 2. Individual spontaneous mutation spectra were derived for all sources as described in Table I. The axes and colour coding of substitutions in the spectra are described in Figure 1.

1 To whom correspondence should be addressed. Tel: +44 1792 205200; Fax: +44 1792 205200; Email: balewis@swan.ac.uk

The studies described here were made possible by support from the Biological and Biotechnology Research Council, Otsuka Pharmaceuticals and BAT Ltd.

References

Adams,W.T. and Skopek,T.R. (1987) Statistical test for the comparison of samples from mutational spectra. J. Mol. Biol., 194, 391–396.
Akasaka,S., Takimoto,K. and Yamamoto,K. (1992) G:C→T:A and G:C→C:G transversions are the predominant spontaneous mutations in the Escherichia coli supF gene: an improved LacZ(Am) Escherichia coli host designed for assaying pZ189 supF mutational specificity. Mol. Gen. Genet., 235, 173–178.
Akman,S.A., Forrest,G.P., Doroshow,J.H. and Dizdaroglu,M. (1991) Mutation of potassium permanganate-treated and hydrogen peroxide-treated plasmid pZ189 replicating in CV-1 monkey kidney cells. Mutat. Res., 261, 123–130.
Ames,B.N. and Gold,L.S. (1991) Endogenous mutagens and the causes of aging and cancer. Mutat. Res., 250, 3–16.
Bigger,C.A.H., Flickinger,D.J., Strandberg,J., Pataki,J., Harvey,R.G. and Dipple,A. (1990) Mutational specificity of the anti-1,2-dihydrodiol 3,4-epoxide of 5-methylchrysene. Carcinogenesis, 11, 2263–2265.
Bigger,C.A.H., Stjohn,J., Yagi,H., Jerina,D.M. and Dipple,A. (1992) Mutagenic specificities of 4 stereoisomeric benzo[c]phenanthrene dihydrodiol epoxides. Proc. Natl Acad. Sci. USA, 89, 368–372.
Bigger,C.A.H., Ponten,I., Page,J.E. and Dipple,A. (2000) Mutational spectra for polycyclic aromatic hydrocarbons in the supF target gene. Mutat. Res., 450, 75–93.
Boldt,J., Mah,M.C.M., Wang,Y.C., Smith,B.A., Beland,F.A., Maher,V.M. and McCormick,J.J. (1991) Kinds of mutations found when a shuttle vector containing adducts of 1,6-dinitropyrene replicates in human cells. Carcinogenesis, 12, 119–126.
Burcham,P.C. (1999) Internal hazards: baseline DNA damage by endogenous products of normal metabolism. Mutat. Res., 443, 11–36.
Cabral Neto,J.B., Gentil,A., Cabral,R.E.C. and Sarasin,A. (1992) Mutation spectrum of heat-induced abasic sites on a single-stranded shuttle vector replicated in mammalian cells. J. Biol. Chem., 267, 19718–19723.
Cabral Neto,J.B., Gentil,A., Cabral,R.E.C. and Sarasin,A. (1993) Implication of uracil in spontaneous mutagenesis on a single-stranded shuttle vector replicated in mammalian cells. Mutat. Res., 288, 249–255.
Canella,K.A. and Seidman,M.M. (2000) Mutation spectra in supF: approaches to elucidating sequence context effects. Mutat. Res., 450, 61–73.
Cariello,N.F., Piegorsch,W.W., Adams,W.T. and Skopek,T.R. (1994) Computer program for the analysis of mutational spectra: application to p53 mutations. Carcinogenesis, 15, 2281–2285.
Cooper,D.N. and Krawczak,M. (1990) The mutational spectrum of single base-pair substitutions causing human genetic disease: patterns and predictions. Hum. Genet., 85, 55–74.
Dammann,R. and Pfeifer,G.P. (1997) Lack of gene- and strand-specific DNA repair in RNA polymerase III-transcribed human tRNA genes. Mol. Cell. Biol., 17, 219–229.
de Boer,J.G., Erfle,H., Holcroft,J., Walsh,D., Dycaico,M., Provost,S., Short,J. and Glickman,B.W. (1996) Spontaneous mutants recovered from liver and germ cell tissue of low copy number lacI transgenic rats. Mutat. Res., 352, 73–78.
de Boer,J.G., Provost,S., Gorelick,N., Tindall,K. and Glickman,B.W. (1998) Spontaneous mutation in lacI transgenic mice: a comparison of tissues. Mutagenesis, 13, 109–114.
Douglas,G.R., Gingerich,J.D., Gossen,J.A. and Bartlett,S.A. (1994) Sequence spectra of spontaneous LacZ gene-mutations in transgenic mouse somatic and germline tissues. Mutagenesis, 9, 451–458.
Eckert,K.A. and Opresko,P.L. (1999) DNA polymerase mutagenic bypass and proofreading of endogenous DNA lesions. Mutat. Res., 424, 221–236.
Friedberg,E.C., Walker,G.C. and Siede,W. (1995) DNA Repair and Mutagenesis. ASM Press, Washington, D.C.
Fronza,G., Madzak,C., Campomenosi,P., Inga,A., Iannone,R., Abbondandolo,A. and Sarasin,A. (1994) Mutation spectrum of 4-nitroquinoline 1-oxide-damaged single-stranded shuttle vector DNA transfected into monkey cells. Mutat. Res., 308, 117–125.
Glickman,B.W., Saddi,V.A. and Curry,J. (1994) International Commission for Protection Against Environmental Mutagens and Carcinogens. Working paper no. 2. Spontaneous mutations in mammalian cells. Mutat. Res., 304, 19–32.
Harbach,P.R., Zimmer,D.M., Filipunas,A.L., Mattes,W.B. and Aaron,C.S. (1999) Spontaneous mutation spectrum at the lambda cII locus in liver, lung and spleen tissue of Big Blue® transgenic mice. Environ. Mol. Mutagen., 33, 132–143.
Hauser,J., Seidman,M.M., Sidur,K. and Dixon,K. (1986) Sequence specificity of point mutations induced during passage of a UV-irradiated shuttle vector plasmid in monkey cells. Mol. Cell. Biol., 6, 277–285.
Hauser,J., Levine,A.S. and Dixon,K. (1987) Unique pattern of point mutations arising after gene-transfer into mammalian cells. EMBO J., 6, 63–67.
Jaberaboansari,A., Dunn,W.C., Preston,R.J., Mitra,S. and Waters,L.C. (1991) Mutations induced by ionizing-radiation in a plasmid replicated in human cells. 2. Sequence-analysis of alpha-particle-induced point mutations. Radiat. Res., 127, 202–210.
Juedes,M.J. and Wogan,G.N. (1996) Peroxynitrite-induced mutation spectra of pSP189 following replication in bacteria and in human cells. Mutat. Res., 349, 51–61.
Kawanishi,M., Matsuda,T., Nakayama,A., Takebe,H., Matsui,S. and Yagi,T. (1998) Molecular analysis of mutations induced by acrolein in human fibroblast cells using supF shuttle vector plasmids. Mutat. Res., 417, 65–73.
Kraemer,K.H. and Seidman,M.M. (1989) Use of supF, an Escherichia coli tyrosine suppressor transfer-RNA gene as a mutagenic target in shuttle-vector plasmids. Mutat. Res., 220, 61–72.
Krawczak,M. and Cooper,D.N. (1997) The human gene mutation database. Trends Genet., 13, 121–122.
Khromov-Borisov,N.N., Rogozin,I.B., Henriques,J.A.P. and de Serres,F.J. (1999) Similarity pattern analysis in mutational distributions. Mutat. Res., 430, 55–74.
Leach,E.G., Narayanan,L., Havre,P.A., Gunther,E.J., Yeasky,T.M. and Glazer,P.M. (1996) Tissue specificity of spontaneous point mutations in lambda supF transgenic mice. Environ. Mol. Mutagen., 28, 459–464.
Lewis,P.D., Harvey,J.S., Waters,E.M. and Parry,J.M. (2000) The mammalian gene mutation database. Mutagenesis, 15, 411–414.
Loeb,L.A. (1989) Endogenous carcinogenesis: molecular oncology into the twenty-first century - Presidential Address. Cancer Res., 49, 5489–5496.
Madzak,C., Cabral Neto,J.B., Menck,C.F.M. and Sarasin,A. (1992) Spontaneous and ultraviolet-induced mutations on a single-stranded shuttle vector transfected into monkey cells. Mutat. Res., 274, 135–145.
Moraes,E.C., Keyse,S.M., Pidoux,M. and Tyrrell,R.M. (1989) The spectrum of mutations generated by passage of a hydrogen-peroxide damaged shuttle vector plasmid through a mammalian host. Nucleic Acids Res., 17, 8301–8312.
Murata-Kamiya,N., Kamiya,H., Kaji,H. and Kasai,H. (1997) Glyoxal, a major product of DNA oxidation, induces mutations at G:C sites on a shuttle vector plasmid replicated in mammalian cells. Nucleic Acids Res., 25, 1897–1902.
Myrand,S.P., Topping,R.S. and States,J.C. (1996) Stable transformation of xeroderma pigmentosum group A cells with an XPA minigene restores normal DNA repair and mutagenesis of UV-treated plasmids. Carcinogenesis, 17, 1909–1917.
Ono,T., Ikehata,H., Nakamura,S., Saito,Y., Komura,J., Hosoi,Y. and Yamamoto,K. (1999) Molecular nature of mutations induced by a high dose of X-rays in spleen, liver and brain of the lacZ-transgenic mouse. Environ. Mol. Mutagen., 34, 97–105.
Sarkar,S., Dasgupta,U.B. and Summers,W.C. (1984) Error-prone mutagenesis detected in mammalian cells by a shuttle vector containing the supF gene of Escherichia coli. Mol. Cell. Biol., 4, 2227–2230.
Seidman,M.M., Dixon,K., Razzaque,A., Zagursky,R.J. and Berman,M.L. (1985) A shuttle vector plasmid for studying carcinogen-induced point mutations in mammalian cells. Gene, 38, 233–237.
Seidman,M.M., Bredberg,A., Seetharam,S. and Kraemer,K.H. (1987) Multiple point mutations in a shuttle vector propagated in human cells: evidence for an error-prone DNA-polymerase activity. Proc. Natl Acad. Sci. USA, 84, 4944–4948.
Sikpi,M.O., Freedman,M.L., Ziobron,E.R., Upholt,W.B. and Lurie,A.G. (1991) Dependence of the mutation spectrum in a shuttle plasmid replicated in human lymphoblasts on dose of gamma-radiation. Int. J. Radiat. Biol., 59, 1115–1126.
Smith,K.C. (1992) Spontaneous mutagenesis: experimental, genetic and other factors. Mutat. Res., 277, 139–162.
Tarone,R.E. (1989) Testing for non-randomness of events in sparse data situations. Ann. Hum. Genet., 53, 381–387.
Vineis,P., Malats,N., Porta,M. and Real,F.X. (1999) Human cancer, carcinogenic exposures and mutation spectra. Mutat. Res., 436, 185–194.
Yang,J.L., Maher,V.M. and McCormick,J.J. (1987) Kinds of mutations formed when a shuttle vector containing adducts of (+/–)-7-beta,8-alpha-dihydroxy-9-alpha,10-alpha-epoxy-7,8,9,10-tetrahydrobenzo[a]pyrene replicates in human cells. Proc. Natl Acad. Sci. USA, 84, 3787–3791.

© UK Environmental Mutagen Society/Oxford University Press 2001

Journal

Mutagenesis, Oxford University Press

Published: Nov 1, 2001
