Quality & Quantity 34: 353–365, 2000.
© 2000 Kluwer Academic Publishers. Printed in the Netherlands.
Reliability of Assessment of
Reading Ability in Three Countries
DOUGAL HUTCHISON, LESLEY KENDALL, DAVID BARTHOLOMEW,
MARTIN KNOTT, JANE GALBRAITH and MARIA PICCOLI
National Foundation for Educational Research in England and Wales, The Mere, Upton Park,
Slough SL1 2DQ, U.K., e-mail: D.Hutchison@NFER.AC.UK
Abstract. Over the last few decades, there have been a number of national and international
programmes to assess performance in key subjects. For example, the International Association for
the Evaluation of Educational Achievement (IEA), International Assessment of Educational Progress
(IAEP), and Third International Mathematics and Science Study (TIMSS) programmes have aimed
to compare performance between participating countries.
Such exercises have different requirements from tests, such as public examinations, that
aim to assess the attainment of individuals. National and international assessments aim to cover a
very wide range of material, and often use the technique of multiple matrix sampling to do so.
The investigation reported here is based on generalizability-type analyses of three national data sets at two
ages, and looks not only at the variance components arising from sampling of schools, and of pupils
within them, but also at the variation between different assessment instruments, and between items
within assessment instruments. If interpreted with care, such results can be of value in the design
of future studies. This paper concentrates largely on the precision of estimates of overall means, but
analyses of the type described could also be used to compare the performance of subpopulations.
Key words: reliability, generalizability, international comparisons.
Over the last few decades, there have been a number of programmes to assess
national performance in key subjects, especially first and other languages,
mathematics and science. Internationally, the IEA and IAEP programmes have aimed to
compare performance between participating countries, and there has recently been
great interest in the results of the Third International Mathematics and Science Study
(TIMSS) (Keys et al., 1996, 1997). Nationally, programmes have included the
National Assessment of Educational Progress (NAEP) in the United States (Mullis
et al., 1994), the School Achievement Indicators Program (SAIP) in Canada (SAIP,
1993), the Assessment of Performance Unit (APU) in England, Wales and Northern
Ireland (see, e.g., Foxman et al., 1991), and the Assessment of Achievement
Programme (AAP) in Scotland (AAP, 1993). Generally these studies relate to school pupils,
but adult literacy is increasingly being investigated (Murray et al., 1998).