1022-7954/05/4102- © 2005 Pleiades Publishing, Inc.
Russian Journal of Genetics, Vol. 41, No. 2, 2005, pp. 202–210. Translated from Genetika, Vol. 41, No. 2, 2005, pp. 269–278.
Original Russian Text Copyright © 2005 by Ruanet, Kochieva, Ryzhova.
Identification is an important issue in genetic,
molecular biological, and phylogenetic studies, and
molecular genetic markers are an important tool for
addressing it. Polymorphic DNA sequences are
increasingly used, along with genetic and biochemical
markers, for the analysis of plant genomes.
DNA markers are also used for genotyping individual
plants, cultivars, populations, and species. Molecular
marker methods based on the polymerase chain reaction
(PCR), followed by separation of the amplified genomic
fragments by electrophoresis in agarose or
polyacrylamide gel, are now the most widespread. Today,
multilocus methods such as random amplified
polymorphic DNA (RAPD), inter-simple sequence repeat
(ISSR), and amplified fragment length polymorphism
(AFLP) analyses are most often used for estimating
genomic polymorphism and determining phylogenetic
relationships [1–3]. The analysis of the spectra of
amplified DNA is one of the most labor-intensive
stages of RAPD, ISSR, and AFLP analysis. This stage,
which includes the analysis of photographs or scanned
gel images, is not only time-consuming but also
error-prone, because of subjective visual estimation
and of errors made when documenting the results of the
analysis. The difficulties are related, first, to the
formalization of graphical information and, second, to
the application of statistical software and parameters
to data that have already been formalized. Artificial
neural networks (ANNs) are a new information technology
intended for solving complex nonlinear problems,
including the processing of information in the form of
images. ANNs are currently considered ideal
“image processors.”
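The formalization step mentioned above can be illustrated with a minimal sketch. It assumes that each gel lane has already been scored as a set of band sizes; the accession names, band sizes, and the choice of the Jaccard coefficient are hypothetical illustrations and are not taken from the study.

```python
from itertools import combinations

# Hypothetical scored lanes: band sizes (bp) observed for each accession.
lanes = {
    "acc_A": {350, 500, 720, 1100},
    "acc_B": {350, 500, 900, 1100},
    "acc_C": {350, 640, 900},
}


def band_matrix(lanes):
    """Return (sorted band positions, {accession: 0/1 presence profile})."""
    positions = sorted(set().union(*lanes.values()))
    profiles = {
        name: [1 if p in bands else 0 for p in positions]
        for name, bands in lanes.items()
    }
    return positions, profiles


def jaccard(a, b):
    """Shared bands divided by bands present in at least one profile."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


positions, profiles = band_matrix(lanes)

# A band position is polymorphic if it is present in some lanes and absent in others.
polymorphic = [
    p for i, p in enumerate(positions)
    if len({prof[i] for prof in profiles.values()}) > 1
]

for a, b in combinations(sorted(profiles), 2):
    print(a, b, round(jaccard(profiles[a], profiles[b]), 3))
```

A binary matrix of this kind is the usual input both for distance-based clustering and for a neural network; the similarity coefficient shown here is only one of several that could be used.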
A characteristic feature of ANN technologies is the
ability to solve non-formalized problems for which no
algorithms have yet been developed. ANN technologies
offer a comparatively simple procedure for creating
such algorithms through learning: the algorithms are
formed automatically in the course of learning by
examples, which substantially simplifies the
preliminary formalization of information. Advantages
such as the capacity for nonlinear modeling and a
relatively simple implementation often make ANNs
indispensable for complex multivariate problems, e.g.,
when dealing with images (in this case, photographs of
electrophoretic spectra). They are especially
effective in exploratory data analysis, when it is
necessary to determine whether the variables studied
are related to one another at all.
Several types of ANNs have been developed to date.
Of these, the multilayer perceptron and the
self-organizing feature map (SOFM) are the most
popular [6, 7]. The choice of the type of ANN depends
on the problem to be solved: the multilayer perceptron
is usually used for prediction and classification, and
the SOFM for data categorization. The SOFM uses a
database to “learn,” after which it can construct a
two-dimensional
The Use of a Self-Organizing Feature Map for the Treatment
of the Results of RAPD and ISSR Analyses in Studies
on the Genomic Polymorphism in the Genus Capsicum
V. V. Ruanet, E. Z. Kochieva, and N. N. Ryzhova
Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, 119991 Russia
Received April 22, 2004
Abstract: The results of studies based on multilocus molecular analyses, including random amplified
polymorphic DNA (RAPD), inter-simple sequence repeat (ISSR), and amplified fragment length polymorphism
(AFLP) analyses, are usually presented in the form of images (electrophoregrams, photographs, etc.). The
interpretation of this information is complicated, labor-intensive, and subjective. Artificial neural networks
(ANNs), which are ideal “image processors,” may be useful for such tasks. The possibility of using
ANNs for processing the results of RAPD and ISSR analyses has been studied. The RAPD and ISSR
fragment spectra of the genus Capsicum L. (peppers) were used in this study. The results of clustering the
accessions studied by means of the unweighted pair-group method with arithmetic averages (UPGMA), which is
often used for phylogenetic reconstructions based on RAPD and ISSR data, served as expert estimates.
Fundamentally new methods of estimating genetic polymorphism using ANN technologies, namely,
self-organizing feature maps (SOFMs), have been developed. The results show that the clusters obtained with
UPGMA and SOFM coincide by more than 90%. Because ANNs can deal with high noise levels and
incomplete or contradictory data, the proposed approach may prove to be efficient.
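To make the idea of SOFM-based grouping more concrete, the following is a minimal sketch of a self-organizing feature map trained on binary band profiles. It is not the authors' implementation: the accession names, the profiles, the 3 x 3 grid, the linear decay schedules, and the function names `train_sofm` and `best_matching_unit` are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_sofm(data, rows=3, cols=3, epochs=200, lr0=0.5, seed=0):
    """Train a small rectangular SOFM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows) for c in range(cols)
    }
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks
        for x in rng.sample(data, len(data)):    # one shuffled pass over the data
            bmu = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                grid_d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for i in range(dim):
                    # Pull the node towards the sample, weighted by grid proximity.
                    w[i] += lr * h * (x[i] - w[i])
    return weights


# Hypothetical binary band profiles (1 = band present, 0 = band absent).
profiles = {
    "acc_1": [1, 1, 1, 0, 0, 0, 1, 0],
    "acc_2": [1, 1, 1, 0, 0, 0, 0, 0],
    "acc_3": [1, 1, 0, 0, 0, 0, 1, 0],
    "acc_4": [0, 0, 0, 1, 1, 1, 0, 1],
    "acc_5": [0, 0, 0, 1, 1, 1, 0, 0],
    "acc_6": [0, 0, 1, 1, 1, 1, 0, 1],
}

weights = train_sofm(list(profiles.values()))
placement = {name: best_matching_unit(weights, p) for name, p in profiles.items()}
for name in sorted(placement):
    print(name, placement[name])
```

Accessions that map to the same or to neighbouring nodes can be read as candidate groups, and these groups can then be compared with the clusters of a UPGMA dendrogram built from the same profiles. The grid size and training schedule would need to be tuned for real RAPD or ISSR data.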