A Many-Facet Rasch analysis comparing essay rater behavior on an academic English reading/writing test used for two purposes

Cited by: 18
Author
Goodwin, Sarah [1]
Affiliation
[1] Georgia State Univ, Atlanta, GA 30303 USA
Keywords
Second language writing assessment; Many-Facet Rasch measurement; L2 writing raters; Factors affecting writing scores; Rater variability; PERFORMANCE; QUALITY
DOI
10.1016/j.asw.2016.07.004
Chinese Library Classification number
G40 [Education]
Discipline classification code
040101; 120403
Abstract
Second language (L2) writing researchers have noted that various rater and scoring variables may affect ratings assigned by human raters (Cumming, 1990; Vaughan, 1991; Weigle, 1994, 1998, 2002; Cumming, Kantor, & Powers, 2001; Lumley, 2002; Barkaoui, 2010). Contrast effects (Daly & Dickson-Markman, 1982; Hales & Tokar, 1975; Hughes, Keeling, & Tuck, 1983), or how previous scores impact later ratings, may also color raters' judgments of writing quality. However, little is known about how raters use the same rubric for different examinee groups. The present paper concerns an integrated reading and writing test of academic English used at a U.S. university for both admissions and placement purposes. Raters are trained to interpret the analytic scoring rubric similarly no matter which test type is scored. Using Many-Facet Rasch measurement (Linacre, 1989/1994), I analyzed scores over seven semesters, examining rater behavior on two test types (admissions or placement). Results indicated that, of 25 raters, five raters showed six instances of statistically significant bias on admissions or placement tests. The findings suggest that raters may be attributing scores to a wider range of writing ability levels on admissions than on placement tests. Implications for assessment, rater perceptions, and small-scale academic testing programs are discussed. © 2016 Elsevier Inc. All rights reserved.
Pages: 21-31
Page count: 11