ISSN 1022-7954, Russian Journal of Genetics, 2010, Vol. 46, No. 9, pp. 1080–1083. © Pleiades Publishing, Inc., 2010.
Original Russian Text © A.S. Komissarov, I.S. Kuznetsova, O.I. Podgornaya, 2010, published in Genetika, 2010, Vol. 46, No. 9, pp. 1217–1221.
Genomic databases contain chromosomal maps consisting only of euchromatic regions. A significant and apparently important part of the genome remains unsequenced, unmapped, and unanalyzed. These essential chromosome parts are the centromeric (CEN) and telomeric regions. The Whole Genome Shotgun (WGS) mouse database comprises both euchromatic and heterochromatic parts of chromosomes. The large amount of satellite DNA (satDNA) in the genome, characteristic of heterochromatic regions, ensured that satDNA fragments were randomly captured in the WGS data.
We analyzed the associations of the centromeric (CEN) minor satellite (MiSat) and the pericentromeric (periCEN) major satellite (MaSat) in the WGS database and began verifying the reliability of the in silico predictions. All MiSat- and MaSat-containing sequences in the WGS database were detected using the BLAST algorithm. The known repeats were detected using the Censor software on the basis of the REPBASE database. The TRF software was used to search for tandem repeats.
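To make the idea of a tandem-repeat search concrete, the following is a minimal Python sketch that reports exact tandem repeats of a given period. It is not the TRF algorithm, which scores approximate repeats and tolerates mismatches and indels. The function name `find_exact_tandem_repeats` and its parameters are hypothetical and chosen only for illustration.

```python
def find_exact_tandem_repeats(seq, period, min_copies=3):
    """Return (start, end, monomer) tuples for exact tandem repeats.

    Illustrative only: a run is extended while each base equals the base
    `period` positions downstream. Mismatches and indels are not tolerated,
    and a run whose true period divides `period` is reported as well.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    i = 0
    while i + period < n:
        j = i
        # Extend the run while the sequence stays periodic.
        while j + period < n and seq[j] == seq[j + period]:
            j += 1
        run_len = (j - i) + period  # number of bases covered by the run
        if run_len >= period * min_copies:
            hits.append((i, i + run_len, seq[i:i + period]))
            i += run_len  # report non-overlapping runs only
        else:
            i = j + 1  # no run starting before j + 1 can be longer
    return hits


# Hypothetical toy sequence: four copies of a 4-bp monomer with short flanks.
example = "CC" + "ACGT" * 4 + "GG"
print(find_exact_tandem_repeats(example, period=4))
```

In practice, satellite monomers diverge from one another, so an exact-match scan like this one would fragment real arrays; it only shows the periodicity test on which such searches are built.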
Monotonous MiSat and MaSat arrays, as well as MiSat-to-MaSat transitions, were found, whereas not a single transition to a telomeric repeat was observed. Two association classes proved to be of most interest: (1) MiSat with ORF2 and 5' LTR regions of IAP and LINE retroelements and (2) MaSat with tandem repeats with 21- and 31-bp monomers.
The present report deals with the first association class, which is characteristic of the CEN MiSat. The results of the WGS analysis are shown in the table. The WGS database contains both separate short sequences and contigs. We selected 8453 sequences containing at least one consensus MiSat sequence. Only 27 of them are longer than 10 kb, while 66 are 3–10 kb in length. The selected MiSat-containing sequences are mainly short (1300 are 1–3 kb in length and 7061 are shorter than 1 kb; see table).
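The length classes used above can be reproduced with a simple binning step. The sketch below is only an illustration: the function names are hypothetical, the input lengths are invented toy values, and the handling of sequences lying exactly on a class boundary (1, 3, or 10 kb) is an assumption, since the text does not specify it.

```python
from collections import Counter


def length_class(length_bp):
    """Assign a sequence length (in bp) to one of the four length classes.

    Boundary handling is an assumption made for this sketch.
    """
    if length_bp < 1_000:
        return "<1 kb"
    if length_bp < 3_000:
        return "1-3 kb"
    if length_bp <= 10_000:
        return "3-10 kb"
    return ">10 kb"


def summarize_lengths(lengths_bp):
    """Count how many sequences fall into each length class."""
    return Counter(length_class(n) for n in lengths_bp)


# Hypothetical lengths, not data from the WGS database.
print(summarize_lengths([500, 999, 1000, 2500, 3000, 10000, 10001]))
```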
Most of the selected sequences (8180) consist of monotonous MiSat arrays, which were not found in sequences longer than 5 kb. The presence of high-order repeats (HOR) was analyzed in the monotonous arrays longer than 1 kb using the dnadot procedure. None of the MiSat arrays contained HOR, in contrast to the MaSat arrays, which in turn confirms the high conservation of the MiSat consensus demonstrated previously in the pregenomic era. Differences between the individual sequenced MiSat copies are a necessary condition for contig construction. Their absence probably explains the failure to detect long MiSat contigs.
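The logic of a dot-plot search for HOR can be outlined as follows. This is a generic self-comparison sketch, not the dnadot procedure itself, whose parameters are not described here. The function name `self_dotplot_offsets` and the word size are hypothetical. Exact word matches between two positions of the same sequence lie on a diagonal whose offset equals the distance between them; in a monotonous array, matches are expected at every multiple of the monomer length, whereas a HOR would show markedly stronger diagonals only at multiples of the HOR period.

```python
from collections import defaultdict


def self_dotplot_offsets(seq, word=8):
    """Count exact word matches on each positive diagonal of a self dot plot.

    Returns a dict mapping offset (distance between matching words) to the
    number of matching word pairs. The pairwise loop is quadratic in the
    copy number of each word, which is acceptable only for short arrays.
    """
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - word + 1):
        positions[seq[i:i + word]].append(i)

    offsets = defaultdict(int)
    for pos_list in positions.values():
        for a in range(len(pos_list)):
            for b in range(a + 1, len(pos_list)):
                offsets[pos_list[b] - pos_list[a]] += 1
    return dict(offsets)


# Hypothetical toy array: three identical copies of an 8-bp monomer.
toy_array = "ACGTTGCA" * 3
print(self_dotplot_offsets(toy_array, word=4))
```

For the toy array, matches occur only at offsets that are multiples of 8, as expected for identical monomers with no higher-order structure.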
Mouse Centromeric Tandem Repeats in silico and in situ
A. S. Komissarov, I. S. Kuznetsova, and O. I. Podgornaya
Institute of Cytology, Russian Academy of Sciences, St. Petersburg, 194064 Russia;
email: firstname.lastname@example.org, email@example.com.
Received January 28, 2010
Abstract: The search for all sequences containing the centromeric (CEN) minor satellite (MiSat) or the pericentromeric (periCEN) mouse major satellite (MaSat) was conducted in the whole genome shotgun (WGS) database. The sequences were checked for the presence of known dispersed repeats using the Censor software. The presence of tandem repeats was tested using Tandem Repeat Finder (TRF). Monotonous MiSat and MaSat arrays and MaSat-to-MiSat array transitions were detected. Moreover, two other types of contacts were revealed: (1) MiSat transition to fragments of the retroelements LINE and IAP (ERV family, intracisternal A-type particles), mainly to ORF2- and 5' LTR-containing elements; (2) MaSat transition to two tandem repeats with monomers 21 bp and 31 bp in size. The presence of the MiSat/IAP transition could be checked experimentally. The common DNA motif among the IAP fragments close to MiSat was isolated. IAP-specific primers were constructed, and the fragments obtained by PCR with IAP and MiSat primers were used to construct a plasmid vector library. Clone n51, with the longest insert (~800 bp), was selected from the library. FISH on extended chromatin fibers (fiber-FISH) carried out with the n51 clone demonstrated that the main signal definitely belonged to CEN. However, signals on the chromosome arms were also detected, which could be due to the partial homology of n51 to dispersed repeats. Double fiber-FISH with MiSat and n51 made it possible to measure the distances between the fragments. The previously obtained MS3 sequence has some homology to IAP and a CEN localization. Thus, regular associations of MiSat with IAP retroelements were shown in silico and in situ. Together with the published data, the present findings suggest that retroelements or their fragments may be essential components of the normal centromere of