Quality & Quantity 34: 331–351, 2000.
© 2000 Kluwer Academic Publishers. Printed in the Netherlands.
Imputation of Missing Item Responses:
Some Simple Techniques
Department of Statistics, Measurement Theory, & Information Technology, FPPSW, University of
Groningen, Grote Kruisstraat 2/1, 9712 TS Groningen, The Netherlands, e-mail:
Abstract. Among the wide variety of procedures for handling missing data, imputing the missing
values is a popular strategy for dealing with missing item responses. In this paper some simple and
easily implemented imputation techniques, such as item and person mean substitution and some hot-deck
procedures, are investigated. A simulation study was performed based on responses to items forming
a scale to measure a latent trait of the respondents. The effects of different imputation procedures on
the estimation of the latent ability of the respondents were investigated, as well as the effects on the
estimation of Cronbach’s alpha (indicating the reliability of the test) and Loevinger’s H-coefficient
(indicating scalability). The results indicate that procedures which use the relationships between
items perform best, although they tend to overestimate the scale quality.
Key words: missing data, mean imputation, hot-deck imputation, item response theory, simulation.
Among the wide variety of procedures to handle missing data, imputation is a
popular strategy to deal with missing item responses. Imputation procedures produce
estimates of the missing values, which replace the blanks in the data
set. The completed data set can then be analyzed with all techniques usually applied to
complete data. Imputing missing values, however, is not without danger. Dempster
and Rubin (1983) state: “The idea of imputation is both seductive and dangerous. It
is seductive because it can lull the user into the pleasurable state of believing that
the data are complete after all, and it is dangerous because it lumps together situations
where the problem is sufficiently minor that it can be legitimately handled in
this way and situations where standard estimators applied to the real and imputed
data have substantial biases” (p. 8).
The danger comes from the possible difference between respondents and non-respondents.
When this difference is systematic, the results of analyses may be
biased and false conclusions are easily drawn. Despite these dangers, imputation is a
popular technique, because it allows the researcher to use standard complete-data
methods of analysis on the filled-in data. However, naive imputations may be worse
than doing nothing, so care is needed (see Little, 1988).
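As a concrete illustration of the two simplest techniques named in the abstract, item mean and person mean substitution can be sketched as follows. This is a minimal sketch, not the code used in the study; the small respondent-by-item matrix and the function names are purely illustrative.

```python
import numpy as np

def item_mean_impute(X):
    """Replace each missing entry by the mean of its item (column)."""
    X = X.astype(float).copy()
    col_means = np.nanmean(X, axis=0)          # per-item means, ignoring NaN
    rows, cols = np.where(np.isnan(X))         # positions of missing entries
    X[rows, cols] = np.take(col_means, cols)   # fill with the item's mean
    return X

def person_mean_impute(X):
    """Replace each missing entry by the mean of its respondent (row)."""
    X = X.astype(float).copy()
    row_means = np.nanmean(X, axis=1)          # per-respondent means, ignoring NaN
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = np.take(row_means, rows)   # fill with the respondent's mean
    return X

# Illustrative data: 3 respondents by 4 dichotomous items,
# with np.nan marking a missing item response.
X = np.array([[1.0, 0.0, 1.0, np.nan],
              [np.nan, 1.0, 1.0, 1.0],
              [0.0, 0.0, np.nan, 1.0]])
```

Item mean substitution ignores everything the respondent did answer, while person mean substitution ignores how other respondents answered the item; both therefore discard part of the relationships between items that, as the abstract notes, the better-performing procedures exploit.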