Statistical description of interrater variability in ordinal ratings

Cited by: 69
Authors
Nelson, JC [1]
Pepe, MS [1]
Affiliations
[1] Univ Washington, Dept Biostat, Seattle, WA 98195 USA
Keywords
DOI
10.1191/096228000701555262
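Chinese Library Classification number
R19 [Health care organizations and services (health services administration)];
Discipline classification code
Abstract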
Ordinal categorical assessments are common in medical practice and in research. Variability in such measurements amongst raters making the assessments can be problematic. In this paper we consider how such variability can be described statistically. We review three current approaches, including kappa-type statistics, loglinear models for agreement, and latent class agreement models, and discuss their limitations. We present a new graphical approach to describing interrater variability that involves a simple frequency distribution display of the category probabilities. The method enables description of interrater variability when raters are a random sample from some population as opposed to the traditional setting in which only a few selected raters provide assessments. Advantages of this approach relative to current approaches include the following: (1) it provides a simple visual summary of the rating data, (2) description is closely linked to familiar methods for describing variability in continuous measurements, (3) interpretation is straightforward, and (4) a large sample of raters can be accommodated with ease. We illustrate the method on simulated ordinal data representing radiologists' ratings of mammography images and on rating data from a national image reading study of mammography screening.
Pages: 475-496
Number of pages: 22