A Data-Driven Approach for the Identification of Features for Automated Feedback on Academic Essays

Cited by: 2
Authors
Abbas, Mohsin [1 ,2 ]
van Rosmalen, Peter [3 ]
Kalz, Marco [4 ]
Affiliations
[1] Open Univ Netherlands, Dept Strateg Management, Fac Management Sci, UNESCO Chair Open Educ, NL-6419 AT Heerlen, Netherlands
[2] Univ Cent Punjab, Fac Informat Technol & Comp Sci, Lahore 54590, Pakistan
[3] Maastricht Univ, Fac Hlth Med & Life Sci, Sch Hlth Profess Educ, Dept Educ Dev & Res, NL-6211 LK Maastricht, Netherlands
[4] Heidelberg Univ Educ, D-69120 Heidelberg, Germany
Source
Keywords
Artificial neural networks; backward elimination; dimensionality reduction; feature reduction; feature selection; k-fold cross validation; Levenberg-Marquardt (LM); natural language processing; writing quality; linguistic features; lexical diversity; vocabulary size; text; richness; ChatGPT; impact
DOI
10.1109/TLT.2023.3320877
Chinese Library Classification number
TP39 [Computer applications];
Subject classification codes
081203 ; 0835 ;
Abstract
To predict and improve essay quality, text analytic metrics (surface, syntactic, morphological, and semantic features) can be used to provide formative feedback to students in higher education. The goal of this study was to identify, via a data-driven approach, a sufficient number of features that serve as a fair proxy for the scores given by human raters. A large number of features were extracted using an existing corpus and a text analysis tool for the Dutch language. Artificial neural networks, the Levenberg-Marquardt algorithm, and backward elimination were used to reduce the number of features automatically. Irrelevant features were eliminated based on the inter-rater agreement between predicted and human scores, calculated using Cohen's kappa (κ). The number of features was reduced from 457 to 28, and the remaining features were grouped into categories. The results reported in this article improve on a similar previous study in two ways. First, the inter-rater reliability between the predicted scores and human raters was increased by adjusting the corpus to address overfitting toward average scores. The resulting maximum value of κ showed substantial agreement, compared with moderate inter-rater reliability in the prior study. Second, instead of using a dedicated training and test set, the training and testing phases in the new experiments were performed using k-fold cross validation on the corpus of texts. The approach presented in this article is the first step toward our ultimate goal of providing meaningful formative feedback to students to enhance their writing skills and capabilities.
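The selection loop described in the abstract (drop a feature, re-score with k-fold cross validation, keep the drop if agreement with the human scores does not fall) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: a nearest-centroid classifier stands in for the artificial neural network trained with Levenberg-Marquardt, agreement is measured with unweighted Cohen's kappa, and all function names (`cohen_kappa`, `k_fold_indices`, `cv_kappa`, `backward_eliminate`) are hypothetical.

```python
import random
from collections import Counter


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa between two equal-length label sequences."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    expected = sum(ca[k] * cb[k] for k in ca) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences use one identical label throughout
    return (observed - expected) / (1 - expected)


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def predict_nearest_centroid(train_X, train_y, test_X, cols):
    """Stand-in model (assumption): assign each test row to the closest class mean."""
    sums, counts = {}, Counter(train_y)
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(cols))
        for j, c in enumerate(cols):
            acc[j] += row[c]
    centroids = {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

    def nearest(row):
        return min(
            centroids,
            key=lambda lab: sum((row[c] - m) ** 2 for c, m in zip(cols, centroids[lab])),
        )

    return [nearest(row) for row in test_X]


def cv_kappa(X, y, cols, k=5):
    """Kappa between cross-validated predictions and the reference scores."""
    preds = [None] * len(y)
    for fold in k_fold_indices(len(y), k):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        out = predict_nearest_centroid(
            [X[i] for i in train], [y[i] for i in train], [X[i] for i in fold], cols
        )
        for i, p in zip(fold, out):
            preds[i] = p
    return cohen_kappa(preds, y)


def backward_eliminate(X, y, k=5):
    """Greedily remove features while cross-validated kappa does not decrease."""
    cols = list(range(len(X[0])))
    best = cv_kappa(X, y, cols, k)
    while len(cols) > 1:
        # Score every single-feature removal and pick the least harmful one.
        score, drop = max(
            (cv_kappa(X, y, [c for c in cols if c != d], k), d) for d in cols
        )
        if score < best:
            break  # every remaining feature contributes to agreement
        best = score
        cols.remove(drop)
    return cols, best
```

Calling `backward_eliminate(X, y)` on a list of feature rows `X` and human scores `y` returns the indices of the retained features and the cross-validated kappa reached with them. The stopping rule used here (stop as soon as any removal lowers kappa) is one possible choice; the abstract does not specify the exact criterion.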
Pages: 914-925
Page count: 12