A Data-Driven Approach for the Identification of Features for Automated Feedback on Academic Essays

Cited by: 2

Authors:

Abbas, Mohsin [1, 2]

van Rosmalen, Peter [3]

Kalz, Marco [4]

Affiliations:

[1] Open Univ Netherlands, Dept Strateg Management, Fac Management Sci, UNESCO Chair Open Educ, NL-6419 AT Heerlen, Netherlands

[2] Univ Cent Punjab, Fac Informat Technol & Comp Sci, Lahore 54590, Pakistan

[3] Maastricht Univ, Fac Hlth Med & Life Sci, Sch Hlth Profess Educ, Dept Educ Dev & Res, NL-6211 LK Maastricht, Netherlands

[4] Heidelberg Univ Educ, D-69120 Heidelberg, Germany

Source:

IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES | 2023 / Vol. 16 / No. 06

Keywords:

Artificial neural networks; backward elimination; dimensionality reduction; feature reduction; feature selection; k-fold cross validation; Levenberg-Marquardt (LM); natural language processing; WRITING QUALITY; LINGUISTIC FEATURES; LEXICAL DIVERSITY; VOCABULARY SIZE; TEXT; RICHNESS; CHATGPT; IMPACT;

DOI:

10.1109/TLT.2023.3320877

Chinese Library Classification:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

For predicting and improving the quality of essays, text analytic metrics (surface, syntactic, morphological, and semantic features) can be used to provide formative feedback to students in higher education. In this study, the goal was to identify a sufficient number of features that exhibit a fair proxy of the scores given by human raters via a data-driven approach. Using an existing corpus and a text analysis tool for the Dutch language, a large number of features were extracted. Artificial neural networks, the Levenberg-Marquardt algorithm, and backward elimination were used to reduce the number of features automatically. Irrelevant features were eliminated based on the inter-rater agreement between predicted and human scores, calculated using Cohen's kappa (κ). The number of features in this study was reduced from 457 to 28, and the remaining features were grouped into different categories. The results reported in this article are an improvement over a similar previous study. First, the inter-rater reliability between the predicted scores and human raters was increased by tweaking the corpus for overfitting for average scores. The resulting maximum value of κ showed substantial agreement, compared to moderate inter-rater reliability in the prior study. Second, instead of using a dedicated training and test set, the training and testing phases in the new experiments were performed using k-fold cross validation on the corpus of texts. The approach presented in this research article is the first step toward our ultimate goal of providing meaningful formative feedback to students for enhancing their writing skills and capabilities.
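
The selection loop described in the abstract (score a feature set by the Cohen's kappa between predicted and human scores under k-fold cross validation, then drop features backward) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the nearest-centroid classifier is a deliberately simple stand-in for the paper's neural network trained with Levenberg-Marquardt, the stopping rule is an assumption, and every function name and the toy data are hypothetical.

```python
from collections import Counter


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa between two equal-length label sequences."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    expected = sum(ca[label] * cb[label] for label in ca) / (n * n)
    if expected == 1.0:
        return 1.0  # convention: both raters used one identical label
    return (observed - expected) / (1 - expected)


def centroid_predict(train_X, train_y, test_X, cols):
    """Stand-in model (NOT the paper's network): nearest class centroid."""
    groups = {}
    for row, label in zip(train_X, train_y):
        groups.setdefault(label, []).append([row[c] for c in cols])
    centroids = {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }

    def nearest(row):
        point = [row[c] for c in cols]
        return min(
            centroids,
            key=lambda lab: sum((p - q) ** 2 for p, q in zip(point, centroids[lab])),
        )

    return [nearest(row) for row in test_X]


def cv_kappa(X, y, cols, k=5):
    """Pool out-of-fold predictions over k folds and score them with kappa."""
    preds = [None] * len(X)
    for fold in range(k):
        test_idx = [i for i in range(len(X)) if i % k == fold]
        train_idx = [i for i in range(len(X)) if i % k != fold]
        fold_preds = centroid_predict(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in test_idx],
            cols,
        )
        for i, p in zip(test_idx, fold_preds):
            preds[i] = p
    return cohen_kappa(y, preds)


def backward_elimination(X, y, k=5):
    """Greedily drop the feature whose removal keeps kappa highest.

    Assumed stopping rule: stop as soon as every removal lowers kappa.
    """
    cols = list(range(len(X[0])))
    best = cv_kappa(X, y, cols, k)
    while len(cols) > 1:
        trials = [
            (cv_kappa(X, y, [c for c in cols if c != drop], k), drop)
            for drop in cols
        ]
        score, drop = max(trials)
        if score < best:
            break
        best = score
        cols.remove(drop)
    return cols, best


# Toy data (hypothetical): column 0 tracks the label, column 1 is noise.
y = [i % 2 for i in range(20)]
X = [[y[i] + (i % 3) * 0.1, float((i * 7) % 11)] for i in range(20)]
selected, best = backward_elimination(X, y, k=5)
print(selected, best)
```

On the toy data the noisy column is eliminated and only the informative column remains. A real run would replace `centroid_predict` with the chosen model and feed in the extracted text features and human scores.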

Pages: 914-925

Number of pages: 12
