Multiple, independent, identical IS6110 insertions in Mycobacterium tuberculosis
Christopher R.E. McEvoy, Robin M. Warren, Paul D. van Helden, Nicolaas C. Gey van Pittius
DST/NRF Centre of Excellence for Biomedical Tuberculosis Research/MRC Centre for Molecular and Cellular Biology, Division of Molecular Biology and Human Genetics, Faculty of
Health Sciences, Stellenbosch University, PO Box 19063, Tygerberg, South Africa
Received 11 May 2009
Received in revised form 12 August 2009
Accepted 16 August 2009
IS6110 insertion sequence
IS6110 is a transposable element found in Mycobacterium tuberculosis complex members. Regions of
preferential IS6110 integration occur within the M. tuberculosis genome but the element has not previously
been shown to exhibit any sequence-specific integration preferences. Here we provide evidence for
multiple independent IS6110 insertions into identical, or near-identical, positions within the highly
region of 3 PPE genes.
© 2009 Elsevier Ltd. All rights reserved.
Due to its high numerical and positional variability, IS6110 has
become a widely used genotypic marker in studies of Mycobacterium
tuberculosis epidemiology. Numerous "hotspot-regions" for
IS6110 integration are known to be present within the M. tuberculosis
genome. These include the PPE genes. However, IS6110 does
not show any sequence-specific integration preference and it is
possible that the "hotspot-regions" may result from higher order
DNA structures. Alternatively, these regions may simply represent
parts of the genome that better tolerate disruption.
As part of a larger study aimed at determining PE and PPE gene
variation in M. tuberculosis we undertook detailed analysis of the
region spanning the PPE38 to PPE40 genes (Figure 1a). This region
comprises 2 identical PPE genes (PPE38 and PPE71) separated by an
888 bp region that contains 2 esat-6-like genes (esx), along with 2
PPE genes (PPE39 and PPE40) that demonstrate complete sequence
identity for the first 538 bp of the gene before the sequences
diverge. The 5′ domains of PPE39 and PPE40 also show over 80%
DNA sequence identity with the 5′ domains of PPE38 and PPE71
(Figure 2). PCR and sequence analysis of the PPE38 gene in 40
clinical isolates representing a broad spectrum of M. tuberculosis
were conducted. Genetic relationships between strains
including their family (F) and cluster status were determined as
previously described. This analysis identified 4 isolates that
showed the presence of IS6110 at position +51. One isolate (SAWC
2185, a member of the Haarlem F2 lineage) had undergone direct
IS6110 integration, as demonstrated by the duplication of a 4 bp
sequence (GGAT) at the site of integration. The other isolates, all
members of atypical or typical Beijing lineages (F31, 27 and 29),
revealed additional mutational events, presumably involving
homologous recombination, that had deleted large regions adjacent
to the other end of the IS that included the first 50 bp of PPE38.
The target duplication was thus absent in these 3 cases. The 3′
domain of the gene was preserved in each case (Figure 1b, c, d).
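
The distinction drawn above between direct integration (target duplication present) and integration followed by an adjacent deletion (target duplication absent) can be illustrated with a minimal Python sketch. It assumes that the sequences immediately flanking the element have already been extracted; the function name, the length bounds and the example flanks are hypothetical and are not taken from the study.

```python
def target_site_duplication(left_flank, right_flank, min_len=3, max_len=6):
    """Return the longest direct repeat formed by the end of the left flank
    and the start of the right flank, or "" if none reaches min_len.

    min_len (assumed >= 1) and max_len are illustrative bounds only.
    """
    left = left_flank.upper()
    right = right_flank.upper()
    longest = min(max_len, len(left), len(right))
    # Try the longest candidate first so the maximal repeat is reported.
    for k in range(longest, min_len - 1, -1):
        if left[-k:] == right[:k]:
            return left[-k:]
    return ""


# Hypothetical flanking sequences, invented for illustration only.
intact = target_site_duplication("ACGTTGGAT", "GGATCCGTA")   # repeat on both sides
deleted = target_site_duplication("ACGTTTTCA", "GGATCCGTA")  # one flank replaced
print(repr(intact), repr(deleted))
```

A non-empty result is consistent with direct integration, whereas an empty result is what would be expected when sequence next to one end of the element has been lost. Short repeats can also match by chance, which is why the sketch ignores matches below `min_len`.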
Analysis of the PPE38, PPE39 and PPE40 genes in 21 publicly
available MTBC genome sequences (available from the databases
http://www.broad.mit.edu/ and http://www.ncbi.nlm.nih.gov/sutils/genom_table.cgi)
that contained a total of 206 IS6110
elements, identified 2 strains (Haarlem and F11) that showed IS6110
integration at position +47 of PPE39 and 1 strain (CPHL_A) that
showed an identical IS6110 integration in PPE40. Direct integration
was seen in all 3 cases with a 3 bp sequence (GGA) duplicated at the
site of integration. An additional integration at PPE40 position +47
was observed in the atypical Beijing isolate 02_1987. In this case the
IS had undergone recombination with another IS6110 element, as
end of the element was fused with Rv1929c (Figure 1e). This
gene is normally located over 450 kb from PPE40 and this mutation
(a large inversion) thus represents a major genomic rearrangement.
Gene alignments indicate that the common PPE38 (+51 bp) and
PPE39/40 (+47 bp) IS6110 insertion points differ by just 1 bp in the
equivalent sequence of these genes (Figure 2).
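
Comparing insertion points that are numbered differently in each gene requires placing both positions on a shared alignment coordinate. The following Python sketch shows one way this could be done, assuming a gapped pairwise alignment is already available; the function names and the toy alignment are hypothetical and do not reproduce the alignment in Figure 2.

```python
def alignment_column(aligned_seq, position, gap="-"):
    """Map a 1-based ungapped position to its 1-based alignment column."""
    seen = 0
    for column, base in enumerate(aligned_seq, start=1):
        if base != gap:
            seen += 1
            if seen == position:
                return column
    raise ValueError("position lies beyond the end of the sequence")


def insertion_offset(aligned_a, pos_a, aligned_b, pos_b):
    """Distance, in alignment columns, between two insertion positions."""
    col_a = alignment_column(aligned_a, pos_a)
    col_b = alignment_column(aligned_b, pos_b)
    return abs(col_a - col_b)


# Toy alignment invented for illustration; gene_a carries a 3 bp gap.
gene_a = "ATG---CCGA"
gene_b = "ATGTTTCCGA"
print(insertion_offset(gene_a, 4, gene_b, 8))
```

In the toy data, positions 4 and 8 fall in adjacent alignment columns even though their gene-specific numbering differs by 4, which mirrors the kind of comparison made between the +51 and +47 insertion points.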
Corresponding author. Tel.: +27 21 9389482.
E-mail addresses: firstname.lastname@example.org (C.R.E. McEvoy), email@example.com (R.M.
Warren), firstname.lastname@example.org (P.D. van Helden), email@example.com (N.C. Gey van Pittius).
Tuberculosis 89 (2009) 439–442