Quality & Quantity 32: 201–211, 1998.
© 1998 Kluwer Academic Publishers. Printed in the Netherlands.
The Effect of Prior Probability on Skill in
Two-Group Discriminant Analysis
Department of Communication, University of Oklahoma, Norman, OK 73019, U.S.A.
Abstract. Although the weights in a discriminant function (both linear and quadratic) are
independent of group prior probabilities, the performance of the classifier (on the training and
validation data) depends sensitively on these often unknown probabilities. After reviewing some defects of
a popular measure of performance in the situation where the group sizes are naturally
disproportionate, three alternative measures of performance (or association) are considered, and it is
shown that their behavior as a function of group prior probability differs from one measure to another.
Consequently, the optimum choice of the group prior probability depends on the specific measure
of performance. Among the measures considered, only two, the index of mean square
contingency and the Heidke Skill Statistic, are found to be well defined in the disparate-group-size
situation and are therefore recommended. An empirical data set dealing with delinquency among
high school students is employed to illustrate all of the findings.
Key words: Discriminant Analysis, prior probability, performance.
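The two claims of the abstract can be made concrete with a small sketch. It is not the analysis reported in this paper: it assumes a single feature variable, two Gaussian groups with a common variance, and made-up means and group sizes. The function names (`lda_rule`, `contingency`, `heidke`, `phi_squared`) are hypothetical, and the index of mean square contingency is taken here to be phi squared (chi-square divided by N), which is an assumption about the definition. Under these assumptions the weight of the linear discriminant does not involve the prior, whereas the intercept, and hence the contingency table and any measure computed from it, does.

```python
import math
import random


def lda_rule(mu0, mu1, sigma, prior1):
    """1-D two-group linear discriminant (equal variances assumed).

    Assign to group 1 when weight * x + intercept > 0.
    Only the intercept depends on the prior probability of group 1.
    """
    weight = (mu1 - mu0) / sigma ** 2
    intercept = -weight * (mu0 + mu1) / 2 + math.log(prior1 / (1 - prior1))
    return weight, intercept


def contingency(xs, labels, weight, intercept):
    """Return (a, b, c, d): hits, false alarms, misses, correct negatives."""
    a = b = c = d = 0
    for x, y in zip(xs, labels):
        pred = 1 if weight * x + intercept > 0 else 0
        if pred == 1 and y == 1:
            a += 1
        elif pred == 1 and y == 0:
            b += 1
        elif pred == 0 and y == 1:
            c += 1
        else:
            d += 1
    return a, b, c, d


def fraction_correct(a, b, c, d):
    """Proportion of cases classified correctly."""
    return (a + d) / (a + b + c + d)


def heidke(a, b, c, d):
    """Heidke skill statistic for a 2x2 table (nan if the denominator is 0)."""
    denom = (a + c) * (c + d) + (a + b) * (b + d)
    return 2 * (a * d - b * c) / denom if denom else float("nan")


def phi_squared(a, b, c, d):
    """Phi squared = chi-square / N (assumed form of the mean square contingency)."""
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return (a * d - b * c) ** 2 / denom if denom else float("nan")


# Illustrative, made-up data: 900 cases in group 0 and 100 in group 1.
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(900)] + [rng.gauss(1.5, 1.0) for _ in range(100)]
labels = [0] * 900 + [1] * 100

for prior1 in (0.1, 0.3, 0.5):
    w, c0 = lda_rule(0.0, 1.5, 1.0, prior1)
    table = contingency(xs, labels, w, c0)
    print(prior1, round(w, 3), round(c0, 3), table,
          round(fraction_correct(*table), 3),
          round(heidke(*table), 3),
          round(phi_squared(*table), 3))
```

Each printed row shows the same weight for every prior but a different intercept and table, so the measures can be compared across priors for this toy case.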
Discriminant Analysis (DA) deals with a range of problems corresponding to the
inference-decision approach in statistical methodology (Huberty, 1994; Klecka,
1980; Lachenbruch, 1975; McLachlan, 1992). In the social sciences, one is fre-
quently interested in drawing inferences about the relationship between some set
of feature variables and group membership. Identifying the impact of each of the
feature variables on group membership is often tantamount to a complete under-
standing of the underlying system. At the other extreme, group membership is to
be decided on the basis of some measurements of the feature variables, and an un-
derstanding of the "weights" of the feature variables is only secondary, if required
at all. In this extreme, one is interested only in predicting group membership, e.g.,
whether or not a tumor is malignant, whether a set of observed atmospheric conditions
is tornadic, or whether a student is delinquent. Most applications of DA, however,
lie somewhere between these two limits: not only is the contribution of
each feature variable of interest, but it is also important to obtain a classifier,
or rule, that is optimal in performing group assignment. Regardless of the need for
understanding the contribution of the feature variables, a measure of performance is
always necessary in order to assess the viability of the model as a whole. Due to its