Maintaining Equivalent Cut Scores for Small Sample Test Forms

Cited by: 8
Author
Dwyer, Andrew C. [1 ]
Affiliation
[1] Amer Board Pediat Inc, Psychometr, 111 Silver Cedar Court, Chapel Hill, NC 27514 USA
Keywords
DOI
10.1111/jedm.12098
CLC Number
G44 [Educational Psychology];
Discipline Code
0402 ; 040202 ;
Abstract
This study examines the effectiveness of three approaches for maintaining equivalent performance standards across test forms with small samples: (1) common-item equating, (2) resetting the standard, and (3) rescaling the standard. Rescaling the standard (i.e., applying common-item equating methodology to standard-setting ratings to account for systematic differences between standard-setting panels) has received almost no attention in the literature. Identity equating was also examined to provide context. Data from a standard-setting form of a large national certification test (N examinees = 4,397; N panelists = 13) were split into content-equivalent subforms with common items, and resampling methodology was used to investigate the error introduced by each approach. Common-item equating (circle-arc and nominal weights mean) was evaluated at samples of size 10, 25, 50, and 100. The standard-setting approaches (resetting and rescaling the standard) were evaluated by resampling (N = 8) and by simulating panelists (N = 8, 13, and 20). Results were inconclusive regarding the relative effectiveness of resetting and rescaling the standard. Small-sample equating, however, consistently produced new form cut scores that were less biased and less prone to random error than new form cut scores based on resetting or rescaling the standard.
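To make the equating approach concrete, the following is a minimal sketch of chained mean equating with nominal weights, one of the small-sample methods named in the abstract. The chained formulation, function name, and all numbers here are illustrative assumptions, not the study's actual implementation or data; the key idea is that the slope linking forms is fixed by the ratio of form lengths rather than estimated from the small sample.

```python
# Illustrative sketch (assumed formulation) of chained nominal-weights
# mean equating through a common-item anchor. Not the study's code.

def nominal_weights_mean_equate(x, mean_x, mean_y, mean_vx, mean_vy,
                                kx, ky, kv):
    """Equate a score x on new form X to the scale of old form Y
    through a common-item (anchor) set V.

    kx, ky, kv  -- numbers of items on forms X, Y, and anchor V;
                   the 'nominal weights' slopes are ratios of these
                   form lengths, so no slope is estimated from data.
    mean_vx, mean_vy -- anchor-score means in the X and Y samples.
    """
    # Step 1: map the X score onto the anchor metric (slope kv/kx).
    v = mean_vx + (kv / kx) * (x - mean_x)
    # Step 2: map the anchor value onto the Y metric (slope ky/kv).
    return mean_y + (ky / kv) * (v - mean_vy)

# Hypothetical example: carry a cut score of 70 on a 100-item new
# form to a 100-item old form through a 20-item anchor.
cut_on_y = nominal_weights_mean_equate(
    x=70, mean_x=65.0, mean_y=68.0,
    mean_vx=13.0, mean_vy=12.5, kx=100, ky=100, kv=20)
```

Because only group means (not slopes) are estimated from examinee data, methods of this family remain stable at the very small sample sizes (10-100) examined in the study.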
Pages: 3-22
Number of pages: 20