Rater variability and reliability of constructed response questions in New York state high-stakes tests of English language arts and mathematics: implications for educational assessment policy

Authors
Jinyan Huang
Patrick B. Whipple
Affiliations
[1] The School of Teacher Education, Jiangsu University
[2] The Evidence-based Research Center for Educational Assessment (ERCEA), Jiangsu University
[3] The Genesee Valley Board of Cooperative Educational Services
Keywords
DOI
Not available
CLC number
Subject classification code
Abstract
Using generalizability (G-) theory as a theoretical framework and research methodology, this study examined the impact of the current one-rater holistic scoring practice on the rater variability and reliability of constructed response questions in New York State high-stakes tests of grades four and six English language arts (ELA) and grades four and five mathematics assessments. Following the New York State scoring rubrics, a total of 36 grades four and six ELA constructed response samples and 72 grades four and five mathematics constructed response samples were marked holistically by ten independent raters having current certifications as educators in the State of New York. The results indicated that the current single-rater holistic scoring practice would not be able to yield acceptable G-coefficients for the New York State grades four and six ELA and grades four and five mathematics assessments. Implications for assessment policy making at the local and state levels are discussed.
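To illustrate why the number of raters matters for a G-coefficient, here is a minimal sketch of a one-facet G-study followed by a D-study. It assumes a fully crossed persons × raters design with one score per cell; the abstract does not state the study's actual design or software, so the function names (`variance_components`, `g_coefficient`) and the score matrix are hypothetical and not taken from the paper.

```python
# Hypothetical sketch: one-facet (persons x raters) generalizability analysis.
# Assumption: fully crossed design, one holistic score per person-rater cell.

def variance_components(scores):
    """Estimate variance components from a persons-by-raters score matrix.

    Returns (var_person, var_rater, var_residual), using the expected
    mean squares of a two-way ANOVA without replication.
    """
    n_p = len(scores)       # number of persons (responses scored)
    n_r = len(scores[0])    # number of raters
    grand = sum(sum(row) for row in scores) / (n_p * n_r)
    p_means = [sum(row) / n_r for row in scores]
    r_means = [sum(scores[i][j] for i in range(n_p)) / n_p for j in range(n_r)]

    ms_p = n_r * sum((m - grand) ** 2 for m in p_means) / (n_p - 1)
    ms_r = n_p * sum((m - grand) ** 2 for m in r_means) / (n_r - 1)
    ss_res = sum(
        (scores[i][j] - p_means[i] - r_means[j] + grand) ** 2
        for i in range(n_p)
        for j in range(n_r)
    )
    ms_res = ss_res / ((n_p - 1) * (n_r - 1))

    # Negative estimates are set to zero, a common convention.
    var_p = max((ms_p - ms_res) / n_r, 0.0)
    var_r = max((ms_r - ms_res) / n_p, 0.0)
    return var_p, var_r, ms_res


def g_coefficient(var_p, var_res, n_raters):
    """Relative G-coefficient when each response is scored by n_raters raters."""
    return var_p / (var_p + var_res / n_raters)


# Made-up scores: 3 responses (rows) scored by 2 raters (columns).
example_scores = [[1, 2], [3, 3], [4, 6]]
var_p, var_r, var_res = variance_components(example_scores)
for n in (1, 2, 3):
    print(n, round(g_coefficient(var_p, var_res, n), 3))
```

Under these assumptions, the residual variance is divided by the number of raters in the D-study, so the coefficient for a single rater is the lowest the design can produce and rises as raters are added.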