ISSN 1022-7954, Russian Journal of Genetics, 2010, Vol. 46, No. 4, pp. 394–403. © Pleiades Publishing, Inc., 2010.
Original Russian Text © V.V. Suslov, P.M. Ponomarenko, M.P. Ponomarenko, I.A. Drachkova, T.V. Arshinova, L.K. Savinkova, N.A. Kolchanov, 2010, published in Genetika,
2010, Vol. 46, No. 4, pp. 448–457.
INTRODUCTION
Realization of genome-wide projects made it possible, for the first time, to quantitatively evaluate the scale of genetic variation. For instance, the first sequenced individual human genomes, the genome of James D. Watson and that of J. Craig Venter, differed from the reference human genome in more than $3 \times 10^{6}$ single nucleotide polymorphisms (SNPs), while the number of shared polymorphisms reached $1.68 \times 10^{6}$ [1]. A similar pattern was revealed upon
comparison of the reference genome with the genome of an individual of East Asian ancestry [2]. Analogous comparisons between the genomes of other eukaryotic species give different estimates, detecting on average one SNP per 50 to 1000 nucleotide pairs. This gives an estimate of 3 to 10 million SNPs per genome (for both coding and noncoding DNA sequences) [3–13]. The search for SNPs that are common to related species made it possible to estimate the approximate lifetime of a polymorphism. For instance, comparison of the X chromosomes of human and orangutan revealed no common SNPs. This means that, since the divergence of these species, the SNPs of the common ancestor were either eliminated or became species-specific [14]. Thus, molecular genetic material proved to be very flexible in evolutionary terms. The high saturation of genomes with polymorphisms makes it rather easy to associate a particular polymorphism with a particular mutant phenotype. On the other hand, the great number of polymorphisms makes it unlikely that the molecular genetic mechanisms by which such polymorphisms affect the phenotype can be analyzed experimentally within a reasonable time.
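The conversion from SNP density to a genome-wide SNP count quoted above is a simple division. The following is a minimal sketch, assuming a genome length of 3 × 10^9 bp (an illustrative, roughly human-sized figure that is not stated in the text); the function name `snp_count` and the chosen spacings are hypothetical.

```python
def snp_count(genome_length_bp: int, bp_per_snp: int) -> int:
    """Expected number of SNPs if one SNP occurs, on average,
    every `bp_per_snp` nucleotide pairs."""
    if bp_per_snp <= 0:
        raise ValueError("bp_per_snp must be positive")
    return genome_length_bp // bp_per_snp


# Assumed genome length (illustrative only, not taken from the article).
GENOME_BP = 3_000_000_000

# One SNP per 1000 bp and one SNP per 300 bp bracket the 3 to 10 million range.
for spacing in (1000, 300):
    print(f"1 SNP per {spacing} bp: {snp_count(GENOME_BP, spacing):,} SNPs")
```

Under this assumed genome length, the 3 to 10 million range corresponds to the sparser end of the density interval; the total scales linearly with genome size, so smaller genomes reach the same counts only at higher SNP densities.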
Thus, one of the main tasks of postgenomic bioinformatics is to provide geneticists with experimental–computer test systems that would enable in silico prediction of the effect of a polymorphism on the genotype from a small number of experiments. Construction of such test systems for polymorphisms located in noncoding DNA regions is particularly difficult. The
absolute majority of the 11 million experimentally
TATA Box Polymorphisms in Genes of Commercial and Laboratory
Animals and Plants Associated with Selectively Valuable Traits
V. V. Suslov^a, P. M. Ponomarenko^b, M. P. Ponomarenko^a, I. A. Drachkova^a, T. V. Arshinova^a, L. K. Savinkova^a, and N. A. Kolchanov^a,c
^a Institute of Cytology and Genetics, Russian Academy of Sciences, Novosibirsk, 630090 Russia;
email: icgadm@bionet.nsc.ru
^b Department of Chemical Physics and Biophysics, Novosibirsk State University, Novosibirsk, 630090 Russia;
email: nauka@nsu.ru
^c Department of Information Biology, Novosibirsk State University, Novosibirsk, 630090 Russia;
email: bionet.nsc.ru/chair/cib
Received October 7, 2009
Abstract: Most of the more than 11 million experimentally established polymorphisms accumulated in dbSNP were identified in intergenic spacers or in coding DNA regions. The former can therefore be interpreted as neutral, while for the latter the biological sense of the associated mutant phenotypes is clear: "the defect of certain proteins". The association of polymorphisms in regulatory DNA regions with mutant phenotypes is poorly studied. Specifically, defects in particular DNA/protein binding sites have been identified in fewer than 500 cases. In TATA-containing genes of eukaryotes, the TATA box, the binding site for TBP (TATA-binding protein), is located about 30 bp upstream of the transcription start site. Interaction between DNA and TBP triggers assembly of the preinitiation complex. For 38 TATA box polymorphisms in the genes of commercial and laboratory animals and plants, the effect on TBP-binding activity was evaluated using the equilibrium equation for the four successive steps of TBP/TATA box binding (nonspecific binding, sliding, recognition, and stabilization). According to GenBank data, these 38 polymorphisms were associated with changes in a number of selectively valuable traits. The statistically significant congruence of the in silico analysis with the mutant phenotypes (α < 0.05, binomial law) suggests a mechanism for the phenotypic manifestation of these polymorphisms (a change in TBP-binding activity) and also supports the possibility of developing a universal test system for experimental–computer prediction of the effects of TATA box mutations in specified genes on selectively valuable traits of species, varieties, and breeds.
DOI: 10.1134/S1022795410040022
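The significance statement in the abstract (α < 0.05, binomial law) can be illustrated with a one-sided binomial tail probability. This is only a sketch under stated assumptions: the number of congruent predictions (26) is hypothetical, since the abstract reports only the total of 38 polymorphisms, and the null-hypothesis probability of 0.5 (a prediction matching the phenotype by chance) is likewise an assumption rather than the authors' stated procedure.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float = 0.5) -> float:
    """One-sided tail probability P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_POLYMORPHISMS = 38  # total number of polymorphisms, from the abstract
N_CONGRUENT = 26      # hypothetical count of predictions matching the phenotype
P_CHANCE = 0.5        # assumed chance-level probability of a match

p_value = binomial_tail(N_CONGRUENT, N_POLYMORPHISMS, P_CHANCE)
print(f"P(X >= {N_CONGRUENT}) = {p_value:.4f}; significant at 0.05: {p_value < 0.05}")
```

The exact tail sum is used here instead of a normal approximation because n = 38 is small enough for direct summation to be both cheap and exact.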
MOLECULAR GENETICS