A Proposed Solution to the Base Rate Problem in the Kappa Statistic


Publisher
American Medical Association
Copyright
Copyright © 1985 American Medical Association. All Rights Reserved.
ISSN
0003-990X
eISSN
1598-3636
DOI
10.1001/archpsyc.1985.01790300093012

Abstract

Because it corrects for chance agreement, kappa (K) is a useful statistic for quantifying interrater concordance. However, K has been criticized because its computed value is a function not only of sensitivity and specificity but also of the prevalence, or base rate, of the illness of interest in the particular population under study. For example, it has been shown that, for a hypothetical case in which sensitivity and specificity remain constant at .95 each, K falls from .81 to .14 when the prevalence drops from 50% to 1%. Thus, differing values of K may be entirely due to differences in prevalence.

Calculation of agreement presents different problems depending on whether one is studying reliability or validity. We discuss the quantification of agreement in the pure validity case, the pure reliability case, and studies that fall somewhere between. As a way of minimizing the base rate problem, we propose a statistic for the quantification of agreement (the Y statistic) that can be related to K but is completely independent of prevalence in the case of validity studies and relatively so in the case of reliability studies.
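The base rate effect described above is easy to reproduce numerically. The sketch below (Python, not part of the original article) builds the 2x2 tables implied by the hypothetical case: for the reliability setting, two raters who judge the true state independently, each with sensitivity and specificity of .95; for the validity setting, a single such rater compared against the true state. It then computes Cohen's kappa alongside Yule's coefficient of colligation, which the citation of Yule (reference 6) suggests is the Y statistic the authors propose; that identification, the conditional-independence construction of the tables, and the function names are assumptions made here for illustration.

from math import sqrt

def reliability_table(prevalence, sens, spec):
    # 2x2 table (a, b, c, d) for two raters who judge the true state
    # independently, each with the given sensitivity and specificity.
    # a = both rate "ill", d = both rate "well", b and c are discordant.
    p, q = prevalence, 1.0 - prevalence
    a = p * sens ** 2 + q * (1 - spec) ** 2
    b = p * sens * (1 - sens) + q * (1 - spec) * spec
    c = b                      # identical raters -> symmetric discordant cells
    d = p * (1 - sens) ** 2 + q * spec ** 2
    return a, b, c, d

def validity_table(prevalence, sens, spec):
    # 2x2 table of one rater against the true state (gold standard).
    p, q = prevalence, 1.0 - prevalence
    return p * sens, q * (1 - spec), p * (1 - sens), q * spec

def kappa(a, b, c, d):
    # Cohen's kappa: (observed - chance agreement) / (1 - chance agreement).
    p_obs = a + d
    p_chance = (a + b) * (a + c) + (c + d) * (b + d)
    return (p_obs - p_chance) / (1 - p_chance)

def yule_y(a, b, c, d):
    # Yule's coefficient of colligation:
    # (sqrt(ad) - sqrt(bc)) / (sqrt(ad) + sqrt(bc)).
    return (sqrt(a * d) - sqrt(b * c)) / (sqrt(a * d) + sqrt(b * c))

for prev in (0.50, 0.01):
    rel = reliability_table(prev, sens=0.95, spec=0.95)
    val = validity_table(prev, sens=0.95, spec=0.95)
    print(f"prevalence {prev:.2f}: "
          f"reliability kappa={kappa(*rel):.2f}, Y={yule_y(*rel):.2f}; "
          f"validity kappa={kappa(*val):.2f}, Y={yule_y(*val):.2f}")

# Expected (approximate) output:
#   prevalence 0.50: reliability kappa=0.81, Y=0.81; validity kappa=0.90, Y=0.90
#   prevalence 0.01: reliability kappa=0.14, Y=0.36; validity kappa=0.26, Y=0.90

Under these assumptions, kappa reproduces the abstract's figures (.81 at 50% prevalence, .14 at 1%), while Y remains at .90 in the validity case regardless of prevalence and falls far less sharply than kappa in the reliability case, consistent with the abstract's claim of complete and relative prevalence independence, respectively.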

Journal

Archives of General Psychiatry, American Medical Association

Published: Jul 1, 1985

References

1. Cohen J: A coefficient of agreement for nominal scales. Educ Psychol Measurement 1960;20:37-46.
2. Helzer JE, Robins LN, Taibleson M, et al: Reliability of psychiatric diagnosis: I. A methodological review. Arch Gen Psychiatry 1977;34:129-133.
3. Grove WM, Andreasen NC, McDonald-Scott P, et al: Reliability studies of psychiatric diagnosis. Arch Gen Psychiatry 1981;38:408-413.
4. Carey G, Gottesman II: Reliability and validity in binary ratings: Areas of common misunderstanding in diagnosis and symptom ratings. Arch Gen Psychiatry 1978;35:1454-1459.
5. Maxwell AE: Coefficients of agreement between observers and their interpretation. Br J Psychiatry 1977;130:79-83.
6. Yule GU: On the methods of measuring association between two attributes. J Roy Statist Soc 1912;75:581-642.
7. Bishop YMM, Fienberg SE, Holland PW: Discrete Multivariate Analysis: Theory and Practice. Cambridge, Mass, MIT Press, 1975.