Semi-automatic coding of open-ended text responses in large-scale assessments

Cited by: 6
Authors
Andersen, Nico [1]
Zehner, Fabian [1, 2]
Goldhammer, Frank [1, 2]
Affiliations
[1] DIPF, Leibniz Inst Res & Informat Educ, Rostocker Str 6, D-60323 Frankfurt, Germany
[2] Ctr Int Student Assessment ZIB eV, Frankfurt, Germany
Keywords
clustering; eco; effort reduction; exploring coding assistant; semi-automatic coding; support human raters; AGREEMENT; TRENDS;
DOI
10.1111/jcal.12717
Chinese Library Classification number
G40 [Education]
Discipline classification codes
040101; 120403
Abstract
Background: In large-scale educational assessments, coding open-ended text responses is considerably more expensive and time-consuming than evaluating multiple-choice responses because it requires trained personnel and long manual coding sessions.
Aim: Our semi-supervised coding method eco (exploring coding assistant) dynamically supports human raters by automatically coding a subset of the responses.
Method: We map normalized response texts into a semantic space and cluster response vectors based on their semantic similarity. Assuming that similar codes represent semantically similar responses, we propagate codes to responses in optimally homogeneous clusters. Cluster homogeneity is assessed by strategically querying informative responses and presenting them to a human rater. After each manually assigned code, the method estimates the cluster's code distribution together with a certainty interval and treats the distribution as homogeneous if the certainty exceeds a predefined threshold. If a cluster is judged with sufficient certainty to comprise homogeneous responses, all remaining responses in it are automatically coded accordingly. We evaluated the method in a simulation using different data sets.
Results: With an average miscoding of about 3%, the method reduced the manual coding effort by an average of about 52%.
Conclusion: Combining the advantages of automatic and manual coding produces considerable coding accuracy and reduces the required manual effort.
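The query-then-propagate loop described in the Method part of the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: the clusters are taken as already computed, responses are queried in plain list order rather than by a strategic selection of informative responses, the certainty check uses the lower end of a Wilson score interval as a stand-in for the unspecified certainty interval, and the names `wilson_lower_bound`, `code_cluster`, `ask_rater`, and the threshold of 0.7 are hypothetical.

```python
import math
from collections import Counter


def wilson_lower_bound(successes, n, z=1.96):
    """Lower end of the Wilson score interval for a binomial proportion.

    Assumption: used here as a stand-in certainty measure; the abstract
    does not say which interval the method actually uses.
    """
    if n == 0:
        return 0.0
    p = successes / n
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / denom


def code_cluster(response_ids, ask_rater, threshold=0.7, z=1.96):
    """Code one cluster semi-automatically.

    response_ids: ids of the responses in one precomputed cluster.
    ask_rater: callable returning the human code for a response id
               (hypothetical stand-in for the manual coding step).
    Returns (manual_codes, auto_codes) as two dicts keyed by response id.
    """
    manual = {}
    for i, rid in enumerate(response_ids):
        # One manual coding step.
        manual[rid] = ask_rater(rid)
        # Re-estimate the share of the most frequent code so far.
        top_code, top_count = Counter(manual.values()).most_common(1)[0]
        if wilson_lower_bound(top_count, len(manual), z) >= threshold:
            # Cluster judged homogeneous: propagate the code to the rest.
            auto = {r: top_code for r in response_ids[i + 1:]}
            return manual, auto
    # Threshold never reached: every response was coded manually.
    return manual, {}


# Simulated raters for two toy clusters of 20 responses each.
homogeneous_ids = list(range(20))
mixed_ids = list(range(100, 120))

manual_h, auto_h = code_cluster(homogeneous_ids, lambda rid: "correct")
manual_m, auto_m = code_cluster(
    mixed_ids, lambda rid: "correct" if rid % 2 == 0 else "incorrect"
)

saved = len(auto_h) + len(auto_m)
total = len(homogeneous_ids) + len(mixed_ids)
print(f"manually coded: {total - saved}, automatically coded: {saved}")
```

In this toy setup the threshold trades manual effort against the risk of miscoding: a higher threshold requires more manual codes before a cluster is accepted as homogeneous, and a cluster with mixed codes is never propagated.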
Pages: 841-854
Page count: 14