Selection and use of SNP markers for animal identification and
paternity analysis in U.S. beef cattle
Michael P. Heaton, Gregory P. Harhay, Gary L. Bennett, Roger T. Stone, W. Michael Grosse, Eduardo Casas,
John W. Keele, Timothy P.L. Smith, Carol G. Chitko-McKown, William W. Laegreid
USDA, ARS, U.S. Meat Animal Research Center (MARC), State Spur 18D, P.O. Box 166, Clay Center, Nebraska 68933-0166, USA
Received: 21 November 2001 / Accepted: 30 January 2002
Abstract. DNA marker technology represents a promising means
for determining the genetic identity and kinship of an animal.
Compared with other types of DNA markers, single nucleotide
polymorphisms (SNPs) are attractive because they are abundant,
genetically stable, and amenable to high-throughput automated
analysis. In cattle, the challenge has been to identify a minimal set
of SNPs with sufficient power for use in a variety of popular
breeds and crossbred populations. This report describes a set of 32
highly informative SNP markers distributed among 18 autosomes
and both sex chromosomes. Informativity of these SNPs in U.S.
beef cattle populations was estimated from the distribution of al-
lele and genotype frequencies in two panels: one consisting of 96
purebred sires representing 17 popular breeds, and another with
154 purebred American Angus from six herds in four Midwestern
states. Based on frequency data from these panels, the estimated
probability that two randomly selected, unrelated individuals will
possess identical genotypes for all 32 loci was 2.0 ×10
multi-breed composite populations and 1.9 × 10
Angus populations. The probability that a randomly chosen can-
didate sire will be excluded from paternity was estimated to be
99.9% and 99.4% for the same respective populations. The DNA
immediately surrounding the 32 target SNPs was sequenced in the
96 sires of the multi-breed panel and found to contain an additional
183 polymorphic sites. Knowledge of these additional sites, together
with the 32 target SNPs, allows the design of robust, accurate
genotype assays on a variety of high-throughput SNP genotyping platforms.
A number of DNA marker types are suitable for use in identifica-
tion and kinship analyses in cattle populations. Within the last
decade, short tandem repeat markers (microsatellites) and ampli-
fied fragment length polymorphism (AFLP) markers have been
successfully used in bovine animal identification and parentage
testing (Glowatzki-Mullis et al. 1995; Usha et al. 1995; Ajmone-
Marsan et al. 1997; Heyen et al. 1997; Williams et al. 1997).
Recent advances in high-throughput DNA sequencing, computer
software, and bioinformatics have facilitated the identification of
single nucleotide polymorphism (SNP) markers from amplified
segments of genomic DNA (amplicons). SNPs are the fundamental
unit of genetic variation and are attractive as markers because they are
abundant in cattle (Heaton et al. 2001b), genetically stable in mam-
mals (Markovtsova et al. 2000; Nielsen 2000; Thomson et al.
2000), and amenable to high-throughput automated analysis
(Wang et al. 1998; Lindblad-Toh et al. 2000). Several high-
throughput genotyping technologies have been developed (Kwok
2001) in an attempt to score the growing number of published
human SNPs (Sachidanandam et al. 2001). Many of these geno-
typing technologies are suitable for use in cattle. Although the
present cost of SNP genotyping is typically $1 per genotype, a
collective goal of the Human Genome Project is to deliver tech-
nology for scoring SNPs at less than a penny each (Zubritzky
1999). Widespread and extensive use of SNP genotyping in cattle
will depend on the success of the Human Genome Project in
achieving this goal.
In the development of efficient SNP-based marker systems, it
is critical to consider that SNP informativity may vary signifi-
cantly between populations (Krawczak 1999). A prerequisite for
using cattle SNPs in animal identification and paternity analysis is
the description of a minimal set with sufficient power to uniquely
identify individuals and their parents in a variety of popular breeds
and crossbred populations. The objectives of this study were to 1)
identify a set of bovine SNPs with significant informativity in U.S.
beef cattle, 2) estimate the potential utility of these markers in both
animal identification and paternity testing, and 3) identify the poly-
morphisms in the DNA immediately surrounding the target SNPs
to facilitate the design of accurate, low-cost SNP assays on a
variety of high-throughput genotyping platforms.
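To make the two quantities named in objective 2 concrete, the following Python sketch computes a multi-locus genotype match probability and a paternity exclusion probability from allele frequencies. It is an illustration under stated assumptions, not the procedure used in this study: loci are treated as independent, biallelic, autosomal, and in Hardy-Weinberg equilibrium; the dam's genotype is assumed known for the exclusion formula; and the function names and the uniform allele frequency of 0.5 are hypothetical.

```python
from math import prod


def match_probability(freqs):
    """Chance that two unrelated animals share a genotype at every locus.

    freqs: frequency of one allele at each biallelic SNP.
    Assumes Hardy-Weinberg proportions and independent loci.
    """
    per_locus = []
    for p in freqs:
        q = 1.0 - p
        # Sum of squared genotype frequencies: AA, AB, BB.
        per_locus.append(p**4 + (2 * p * q) ** 2 + q**4)
    return prod(per_locus)


def exclusion_probability(freqs):
    """Chance that a random unrelated candidate sire is excluded at one or more loci.

    Uses the per-locus value pq(1 - pq) for a biallelic codominant marker
    when the dam's genotype is known (an assumption of this sketch).
    """
    not_excluded = prod(1.0 - p * (1 - p) * (1 - p * (1 - p)) for p in freqs)
    return 1.0 - not_excluded


# Hypothetical panel: 32 SNPs, each with an allele frequency of 0.5.
panel = [0.5] * 32
print(f"match probability:     {match_probability(panel):.2e}")
print(f"exclusion probability: {exclusion_probability(panel):.4f}")
```

Under these assumptions each SNP is most informative when its allele frequency is near 0.5, which is why screening for intermediate-frequency SNPs matters. Sex-linked loci would need separate treatment.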
Materials and methods
Animal groups and genomic DNA samples. Three different groups of
cattle DNAs were analyzed in the present study: 1) the MARC Beef Cattle
Diversity Panel (MBCDP) version 2.1 (Heaton et al. 2001a) was used to
sample the breadth of genetic diversity in popular U.S. breeds; 2) the
MARC Purebred Angus Panel (MPAP) version 1.0 (Heaton et al. 2001a)
was used to sample diversity present in purebred American Angus popu-
lations; and 3) the MARC reference cattle population (Bishop et al. 1994)
was used to validate SNP segregation and determine approximate marker
position on the bovine linkage map. MBCDP2.1 consists of 92 sires from
16 popular beef breeds and four sires from the Holstein dairy breed;
MPAP1.0 contains 154 purebred animals sampled from six herds in four
Midwestern states; the MARC reference cattle population contains several
popular U.S. breeds, and includes 28 founding parents, 23 grandparents,
and 262 progeny. DNA was extracted from tissues and arrayed in 96-well
plates as previously described (Heaton et al. 2001a).
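Allele and genotype frequencies for a panel such as MBCDP2.1 or MPAP1.0 can be tallied directly from genotype calls. The sketch below is a minimal illustration, not the software used in this study. The function names, the two-letter genotype strings, and the use of None for a failed call are assumptions made for the example.

```python
from collections import Counter


def genotype_frequencies(calls):
    """Return the observed frequency of each genotype at one SNP.

    calls: list of two-letter genotype strings (e.g. "AG"); None marks a failed call.
    Allele order is ignored, so "AG" and "GA" count as the same genotype.
    """
    scored = ["".join(sorted(c)) for c in calls if c]
    counts = Counter(scored)
    return {g: k / len(scored) for g, k in counts.items()}


def allele_frequencies(calls):
    """Return the observed frequency of each allele at one SNP."""
    alleles = Counter(a for c in calls if c for a in c)
    total = sum(alleles.values())
    return {a: k / total for a, k in alleles.items()}


def observed_match_probability(calls):
    """Sum of squared observed genotype frequencies at one SNP."""
    return sum(f * f for f in genotype_frequencies(calls).values())


# Hypothetical calls for five animals at one SNP; the last call failed.
calls = ["AA", "AG", "GA", "GG", None]
print(genotype_frequencies(calls))
print(allele_frequencies(calls))
print(observed_match_probability(calls))
```

Working from observed genotype frequencies avoids assuming Hardy-Weinberg proportions, at the cost of noisier estimates in small panels.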
Selection criteria for “highly informative” SNPs. Two types of amplicons
were used in the screening for SNPs. The first type consisted of those
resulting from ongoing haplotype studies of seven specific genes [IL1B
(Maliszewski et al. 1988); PRNP (Lemaire-Vieille et al. 2000); NOS2
(C.G. Chitko-McKown, unpublished); CAPN1 (Smith et al. 2000); IL8
(Heaton et al. 2001a); ZFX and ZFY (Xiao et al. 1998; Lawson and Hewitt
2001)]. The second type of amplicons were those resulting from the MARC
bovine EST/SNP linkage mapping project (Smith et al. 2001; Stone et al.
2002). The SNP genotype frequencies of the founding parents and grand-
parents of the MARC reference population were used as a first-level screen
for informativity. When both homozygous SNP genotypes were present in
the reference population founders, the polymorphism was scored in the
Present address for W.M. Grosse: CuraGen Corporation, 322 East Main
Street, Branford, CT 06405, USA.
Correspondence to: M.P. Heaton
Mammalian Genome 13, 272–281 (2002).
© Springer-Verlag New York Inc. 2002