The consequential aspect of validity concerns the actual and potential consequences of test score use, particularly sources of invalidity related to fairness, bias, injustice, and inequity. Differential Item Functioning (DIF) analysis examines test items to evaluate the fairness and validity of educational tests. Moreover, gender is frequently cited as a source of construct-irrelevant variance: if gender exerts a substantial influence on test items, the result is bias. Against this background, the present study investigates the validity of a high-stakes test and the role of gender as a source of bias across the subtests of a language proficiency test. To this end, the Rasch model was used to detect biased items and to examine construct-irrelevant factors. For the DIF analysis, the Rasch model was applied to the responses of 5,000 participants randomly selected from the pool of examinees taking the National University Entrance Exam for Foreign Languages (NUEEFL) as a university entrance requirement for English language studies (i.e., English Literature, Teaching, and Translation). The findings reveal that the test scores are not free of construct-irrelevant variance, and several misfitting items were revised following the suggestions of the fit statistics. Overall, the fairness of the NUEEFL was not confirmed. The results of this psychometric assessment could benefit test designers, stakeholders, administrators, and teachers, and the study recommends that future administrations employ standardized, bias-free tests and instructional materials.
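For reference, a minimal sketch of the model underlying such an analysis (the notation here is illustrative and not reproduced from the study): in the dichotomous Rasch model, the probability that person n answers item i correctly is

P(X_{ni} = 1 \mid \theta_n, b_i) = \frac{\exp(\theta_n - b_i)}{1 + \exp(\theta_n - b_i)}, \qquad \mathrm{DIF}_i = b_i^{(F)} - b_i^{(M)},

where \theta_n is the person's ability, b_i is the item's difficulty, and b_i^{(F)}, b_i^{(M)} are the difficulties calibrated separately in the female and male subsamples. In this standard separate-calibration approach, a contrast that is large relative to its standard error, t_i = \mathrm{DIF}_i / \sqrt{SE_{i,F}^{2} + SE_{i,M}^{2}}, flags item i as functioning differently across gender groups.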