Examining severity and centrality effects in TestDaF writing and speaking assessments: An extended Bayesian many-facet Rasch analysis

Cited by: 7

Authors:

Eckes, Thomas [1]

Jin, Kuan-Yu [2]

Affiliations:

[1] Univ Bochum, TestDaF Inst, Univ Str 134, D-44799 Bochum, Germany

[2] Hong Kong Examinat & Assessment Author, Hong Kong, Peoples R China

Source:

INTERNATIONAL JOURNAL OF TESTING | 2021 / Vol. 21 / Issue 3-4

Keywords:

Rater effects; rater centrality; facets models; performance assessment; Bayesian statistics; MCMC estimation; RATER TYPES; MODEL; QUALITY;

DOI:

10.1080/15305058.2021.1963260

Chinese Library Classification:

C [Social Sciences, General];

Subject classification code:

03 ; 0303 ;

Abstract:

Severity and centrality are two main kinds of rater effects posing threats to the validity and fairness of performance assessments. Adopting Jin and Wang's (2018) extended facets modeling approach, we separately estimated the magnitude of rater severity and centrality effects in the web-based TestDaF (Test of German as a Foreign Language) writing and speaking assessments using Bayesian MCMC methods. The findings revealed that (a) the extended facets model had a better data-model fit than models that ignored either or both kinds of rater effects, (b) rating scale and partial credit versions of the extended model differed in terms of data-model fit for writing and speaking, (c) rater severity and centrality estimates were not significantly correlated with each other, and (d) centrality effects had a demonstrable impact on examinee rank orderings. The discussion focuses on implications for the analysis and evaluation of rating quality in performance assessments.

Pages: 131-153

Page count: 23

7 items in total

[1] A Bayesian many-facet Rasch model with Markov modeling for rater severity drift
Uto, Masaki
BEHAVIOR RESEARCH METHODS, 2023, 55 (07) : 3910 - 3928
[2] Measuring the Impact of Peer Interaction in Group Oral Assessments with an Extended Many-Facet Rasch Model
Jin, Kuan-Yu
Eckes, Thomas
JOURNAL OF EDUCATIONAL MEASUREMENT, 2024, 61 (01) : 47 - 68
[3] Analysis of Peer and Self-Assessments Using the Many-facet Rasch Measurement Model and Student Opinions
Demir, Seda
JOURNAL OF MEASUREMENT AND EVALUATION IN EDUCATION AND PSYCHOLOGY-EPOD, 2023, 14 (03) : 266 - 286
[4] Effects of rating criteria order on the halo effect in L2 writing assessment: a many-facet Rasch measurement analysis
Kim, Hyunwoo
LANGUAGE TESTING IN ASIA, 2020, 10 (1)
[5] Rater agreement and rater severity: A many-faceted Rasch analysis of performance assessments in the "Test Deutsch als Fremdsprache" (TestDaF)
Eckes, T
DIAGNOSTICA, 2004, 50 (02) : 65 - 77
[6] A Many-Facet Rasch analysis comparing essay rater behavior on an academic English reading/writing test used for two purposes
Goodwin, Sarah
ASSESSING WRITING, 2016, 30 : 21 - 31
[7] The Effect of Rubric on Rater's Severity and Bias in TVET Laboratory Practice Assessment: Analysis using Many-Facet Rasch Measurement
Ab Rahman, Azmanirah
Hanafi, Nurfirdawati Muhamad
Yusof, Yusmarwati
Mukhtar, Marina Ibrahim
Awang, Halizah
Yusof, Anizam Mohamed
JOURNAL OF TECHNICAL EDUCATION AND TRAINING, 2020, 12 (01) : 57 - 67
