Quality & Quantity 38: 787–800, 2004.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
Measuring the Reliability of Qualitative Text Analysis Data
The Annenberg School for Communication, University of Pennsylvania, 3620 Walnut Street,
Philadelphia, PA 19104-6220, USA, E-mail: email@example.com
Abstract. This paper reports a new tool, heretofore unavailable to qualitative research, for assessing
the reliability of text interpretations. It responds to a combination of two challenges: the problem
of assessing the reliability of multiple interpretations, a solution to which was anticipated
earlier (Krippendorff, 1992) but not fully developed, and the problem of identifying units of analysis
within a continuum of text and similar representations (Krippendorff, 1995). The paper sketches the
family of α-coefficients, which it extends, and then describes the family's new arrival. A computational
example is included in the Appendix.
Key words: reliability, qualitative, text analysis, unitizing, multiple interpretations, Krippendorff’s alpha
1. The Family of Alpha-Agreement Measures
In the last thirty-some years, α (alpha) has developed from a simple generalization
of several agreement coefficients for two coders, notably Scott’s (1956) π (pi)
for nominal data, Spearman’s ρ (rho) (Siegel, 1956: 202–213) for ordinal data,
and Pearson’s (1901) and Tildesley’s (1921) intraclass correlation r for interval
data, into a whole family of agreement coefficients (Krippendorff, 1970, 1972,
1980, 1995, 2004). This development opened a space for consistent reliability
assessments, applicable to:
• Any number of observers or coders, not just two.
• Incomplete data (unoccupied cells in a reliability data matrix).
• Small sample sizes, for which it corrects.
• Data with any kind of metric: nominal, ordinal, interval, ratio, but also circular,
polar, and specialized kinds.
• Partitions and subsets of units of analysis, including individual units.
• Situations in which data are unitized, not just coded. Coding of predefined
units has dominated the literature.
• Multi-valued data, that is, multiple interpretations of single units of analysis,
not just single-valued data.
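
As a rough illustration of the first three properties listed above (any number of coders, incomplete data, and a correction for small samples), the following minimal Python sketch computes α for nominal data in its general form, α = 1 − D_o/D_e, by way of a coincidence matrix. The function name nominal_alpha, the data layout (one list of values per unit, with None marking a missing value), and the example values are assumptions made for illustration only; they are not taken from this paper, and the sketch does not cover the unitizing or multi-valued extensions discussed here.

```python
from collections import Counter
from itertools import permutations


def nominal_alpha(units):
    """Sketch of alpha for nominal data.

    units: one list per unit of analysis, holding the values assigned
    by the coders; None marks a coder who did not code that unit.
    """
    # Coincidence matrix: each ordered pair of values assigned to the
    # same unit by different coders counts 1 / (m - 1), where m is the
    # number of values actually present in that unit.
    coincidences = Counter()
    for values in units:
        present = [v for v in values if v is not None]
        m = len(present)
        if m < 2:
            continue  # a lone value cannot be paired and is ignored
        for a, b in permutations(present, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals n_c and the total number of pairable values n.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    # Nominal metric: every mismatch counts as 1, every match as 0.
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    # Dividing by (n - 1) rather than n is the small-sample correction.
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1) if n > 1 else 0.0

    if expected == 0:
        return float("nan")  # undefined when no variation is possible
    return 1.0 - observed / expected


# Hypothetical data: two coders, ten units, binary categories.
coder_a = [0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
coder_b = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
print(round(nominal_alpha(list(zip(coder_a, coder_b))), 3))  # 0.095
```

Other metrics (ordinal, interval, ratio) would replace the 0/1 mismatch weight with a metric-specific difference function; that generalization is left out of this sketch.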
α enables various analyses, for example, calculating: