Source and component genes of a 6–200 Mb gene cluster in the
Michel van Geel,
Pieter J. de Jong,
Institut für Biologie, Medizinische Universität zu Lübeck, Ratzeburger Allee 160, D-23538 Lübeck, Germany
Department of Dermatology, University Hospital Nijmegen, Building M321, PO box 9101, 6500 HB, Rene Descartesdreef 1, 6525 GL,
Nijmegen, The Netherlands
Children’s Hospital, Oakland Research Institute, BACPAC Resources, 747 52nd St., Oakland, California 94609, USA
Parke-Davis-Lab for Molecular Genetics, 1501 Bay Harbor Parkway, Alameda, California 94502, USA
Received: 1 February 2001 / Accepted: 17 April 2001
Abstract. We identified and analyzed the genes Sp100, Csprs,
and Ifi75 in two members of the genus Mus, M. musculus and M.
caroli. Sp100 is a nuclear dot gene; Csprs and Ifi75 are novel genes
encoding a putative G-protein coupled receptor (GPCR) and a
putative transcriptional coactivator, respectively. A fourth gene,
Sp100-rs, occurs in M. musculus, but not in M. caroli. Sp100-rs is
a chimeric gene which arose by fusion of Sp100 and Csprs copies.
Sp100-rs and Ifi75 are components of a repeat cluster that extends
over 6–200 Mb of the M. musculus genome. The Sp100-rs fusion
gene arose only 1–2 million years ago and has become fixed and
amplified in M. musculus. Although the gene is transcribed, it
appears to have no function. The repeat cluster may have become
fixed in the species as a ‘hitchhiker’ in a ‘selective sweep’.
Novel genes can arise from preexisting genes by genomic rear-
rangements. Processes like gene fusion or amplification followed
by diversification led to the formation of families of related genes
and contributed significantly to the evolution of genomes.
We studied the components that contributed by fusion and/or
amplification to the formation of an evolutionarily young gene
cluster, the Sp100-rs cluster, in the house mouse, Mus musculus.
The cluster is located in Chromosome (Chr) 1 of M. musculus,
harbors from about 60 to 2000 repeats, and encompasses 6–200
Mb or 0.2–6.7% of the haploid genome (Kunze et al. 1996). Clus-
ters of more than 200 copies are cytogenetically conspicuous as
homogeneously staining regions (Traut et al. 1984; Kunze et al.
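As a quick consistency check on the figures quoted above, the following minimal Python sketch derives the repeat-unit length and the haploid genome size implied by the stated copy numbers, cluster sizes, and genome fractions. The function names are hypothetical and introduced only for illustration. The derived values (a repeat unit of about 100 kb and a genome of roughly 3000 Mb) are inferences from the quoted numbers and are not stated in the text.

```python
# Figures quoted in the text (Kunze et al. 1996).
MIN_COPIES, MAX_COPIES = 60, 2000            # repeats per cluster
MIN_SIZE_MB, MAX_SIZE_MB = 6.0, 200.0        # cluster size in Mb
MIN_FRACTION, MAX_FRACTION = 0.002, 0.067    # 0.2% and 6.7% of the haploid genome


def repeat_unit_kb(cluster_size_mb, copies):
    """Average length of one repeat unit in kb (hypothetical helper)."""
    return cluster_size_mb * 1000.0 / copies


def implied_genome_mb(cluster_size_mb, fraction):
    """Haploid genome size in Mb implied by a cluster size and its genome fraction."""
    return cluster_size_mb / fraction


# Both ends of the quoted range imply the same average repeat unit.
unit_small = repeat_unit_kb(MIN_SIZE_MB, MIN_COPIES)
unit_large = repeat_unit_kb(MAX_SIZE_MB, MAX_COPIES)

# Both ends imply a haploid genome of roughly 3000 Mb.
genome_from_min = implied_genome_mb(MIN_SIZE_MB, MIN_FRACTION)
genome_from_max = implied_genome_mb(MAX_SIZE_MB, MAX_FRACTION)

print(f"repeat unit: {unit_small:.0f} kb and {unit_large:.0f} kb")
print(f"implied genome: {genome_from_min:.0f} Mb and {genome_from_max:.0f} Mb")
```

The two ends of the range agree with each other, which suggests that the size estimates were obtained by scaling the copy number by a constant repeat length.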
The main cluster components are Sp100-rs (Weichenhan et al.
1998a) and Ifi75 (this paper). The absence of Sp100-rs and the
Sp100-rs cluster in a related mouse species, M. caroli, enabled us
to trace the origin of the cluster components. Sp100-rs is a chi-
meric gene, a fusion product of two source genes, the nuclear dot
gene Sp100 (Weichenhan et al. 1997) and the Csprs gene, which is
described in this paper. The two parts became fused at ancestral
breakpoints, the Sp100 breakpoint (designated Sp100-BP), which
is located in the third intron of Sp100, and the Csprs breakpoint
(designated Csprs-BP), located about 10 kb 5′ of Csprs.
The fusion gene Sp100-rs has become fixed, highly amplified,
and diversified in M. musculus (Plass et al. 1995; Weichenhan et
al. 1998a). It forms an extensively variable gene family, but, al-
though it is ubiquitously transcribed (Weichenhan et al. 1995), we
did not find transcripts that would give rise to functional gene
Materials and methods
Mouse strains and chromosomes.
M. caroli was kindly provided by
Rosemary Elliott (Buffalo, N.Y.). M. musculus was represented by labo-
ratory strains C57BL/6 and MUT with about 60 and 920 Sp100-rs gene
copies, respectively (Kunze et al. 1996).
Primers are listed in Table 1. They were custom-made by MWG
Biotech (Ebersberg) or TIB Molbiol (Berlin).
Unless otherwise stated, the Expand kit from Roche Diagnostics
(Mannheim, Germany) was used according to the instructions of the manu-
facturer. Rapid amplification of cDNA ends (RACE) with RACE primers
(Frohman 1993) and RT-PCR were carried out with either
total or poly(A)+
RNA from M. caroli. In the first PCR of 5′ RACE, Taq
Polymerase from Life Technologies (Eggenstein, Germany) was used
instead of Expand. To amplify the 5′ end of the Csprs transcript from total
RNA of M. caroli spleen, we used primers 10C (first-strand synthesis at
54°C for 1 h with Superscript II from Life Technologies), Ex5arev2 with
(first PCR, annealing at 55°C for 1 min, extension for 3 min, 40
cycles), and 4-5aB1 with Q
(second PCR, annealing at 55°C for 30 s,
extension for 5 min, 30 cycles). Primary and secondary 3′ RACE of the
Csprs gene were performed with primer combinations altEx4F1/Q
, respectively (same conditions: annealing at 60°C for 30 s,
extension for 10 min, 30 cycles). For RT-PCR, primers altEx4F1 and
Mcg2B4 were used (annealing at 55°C for 30 s, extension for 5 min, 30
cycles). The 5′ end of the Ifi75 transcript was amplified from mRNA of M.
caroli liver with primers G2homB1 (first-strand synthesis as described
above), G2homB2 with Q
(first PCR, annealing at 58°C for 1 min,
extension for 3 min, 40 cycles), and G2homB3 with Q
(second PCR, annealing at 55°C for 30 s, extension for 2 min, 30 cycles). The 3′ RACE
of Ifi75 was done with the primer combination G2homF1/Q
(annealing at 55°C for 1 min, extension for 3 min, 30 cycles) and subsequently with
(annealing at 52°C for 30 s, extension for 3 min, 30 cycles).
Genomic DNA harboring the Csprs gene of M. caroli was amplified
with primers G2F1/G2B1 at 55°C annealing temperature for 15 s, extension
for 5 min, and 35 cycles. Genomic DNA containing the 5′ and middle
part of Ifi75 from M. caroli was amplified by nested PCR with primers
G3F1/G2homB1 and subsequently G3F2/G2homB3, whereas the remain-
der of the gene was amplified in a single PCR with primers G2homF1/
G3B1 (all under the same conditions: annealing at 60°C for 15 s, extension
for 15 min, 30 cycles). Genomic M. caroli-DNA covering Sp100-rs exon
4 and flanking sequences was amplified with primers Rsint3F1/Rsint3B1 at
Correspondence to: Dieter Weichenhan; E-mail: firstname.lastname@example.org-
The nucleotide sequence data reported in this paper have been submitted to
the EMBL Data base and have been assigned the accession numbers
AJ295705–AJ295708, AJ401311, AJ401358–AJ401371, AJ401375,
Mammalian Genome 12, 590–594 (2001).
© Springer-Verlag New York Inc. 2001