The generalizability of ratings of item relevance

Cited by: 2
Authors:
Norcini, J [1]
Grosso, L [1]
Affiliation:
[1] Amer Board Internal Med, Philadelphia, PA 19106 USA
DOI: 10.1207/s15324818ame1104_1
Chinese Library Classification: G40 [Education]
Subject Classification Codes: 040101; 120403
Abstract
The relevance of test content to practice is essential for credentialing examinations, and one way to ensure it is to collect ratings of item relevance from job incumbents. This study analyzed ratings of the 132 single-best-answer items and 117 multiple true-false item sets that formed the pretest books in a single administration of a medical certifying examination. Ratings collected from 57 practitioners were high (an average of more than 4 on a 5-point scale) and correlated with item difficulty (r = .31 to .34). The relationship between ratings and item discrimination was less clear (r = -.04 to .31). Application of generalizability theory to the ratings showed that reasonable estimates of item, stem, and total test relevance can be obtained with about 10 raters.
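The closing claim is the result of a decision (D) study: once variance components for items, raters, and the residual have been estimated from the observed ratings, the generalizability of relevance estimates averaged over n raters can be projected for any panel size. A minimal sketch of that projection for a one-facet items x raters crossed design is given below; the variance components are hypothetical placeholders, not values reported in the article.

```python
# D-study sketch for an items x raters (i x r) crossed design.
# The variance components below are illustrative assumptions only;
# Norcini and Grosso's estimated components are not reproduced here.

def g_coefficients(var_item, var_rater, var_resid, n_raters):
    """Return (relative, absolute) generalizability coefficients for
    item relevance ratings averaged over n_raters raters."""
    rel_error = var_resid / n_raters                # sigma^2(ir,e) / n_r
    abs_error = (var_rater + var_resid) / n_raters  # adds the rater main effect
    e_rho2 = var_item / (var_item + rel_error)      # relative (norm-referenced)
    phi = var_item / (var_item + abs_error)         # absolute (domain-referenced)
    return e_rho2, phi

# Hypothetical variance components for item, rater, and residual effects.
var_item, var_rater, var_resid = 0.15, 0.05, 0.60

for n in (1, 5, 10, 20, 57):
    e_rho2, phi = g_coefficients(var_item, var_rater, var_resid, n)
    print(f"n_raters={n:2d}  E(rho^2)={e_rho2:.2f}  Phi={phi:.2f}")
```

Under such components the coefficients rise steeply up to roughly 10 raters and flatten thereafter, which is the sense in which a panel of about 10 raters can yield reasonable estimates of item, stem, and total test relevance.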
Pages: 301-309 (9 pages)