A Data-Driven Approach for the Identification of Features for Automated Feedback on Academic Essays

Cited by: 2
Authors
Abbas, Mohsin [1 ,2 ]
van Rosmalen, Peter [3 ]
Kalz, Marco [4 ]
Affiliations
[1] Open Univ Netherlands, Dept Strateg Management, Fac Management Sci, UNESCO Chair Open Educ, NL-6419 AT Heerlen, Netherlands
[2] Univ Cent Punjab, Fac Informat Technol & Comp Sci, Lahore 54590, Pakistan
[3] Maastricht Univ, Fac Hlth Med & Life Sci, Sch Hlth Profess Educ, Dept Educ Dev & Res, NL-6211 LK Maastricht, Netherlands
[4] Heidelberg Univ Educ, D-69120 Heidelberg, Germany
Keywords
Artificial neural networks; backward elimination; dimensionality reduction; feature reduction; feature selection; k-fold cross validation; Levenberg-Marquardt (LM); natural language processing; WRITING QUALITY; LINGUISTIC FEATURES; LEXICAL DIVERSITY; VOCABULARY SIZE; TEXT; RICHNESS; CHATGPT; IMPACT;
DOI
10.1109/TLT.2023.3320877
Chinese Library Classification
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Text analytic metrics (surface, syntactic, morphological, and semantic features) can be used to predict and improve essay quality and to provide formative feedback to students in higher education. In this study, the goal was to identify, via a data-driven approach, a sufficient number of features that serve as a fair proxy for the scores given by human raters. Using an existing corpus and a text analysis tool for the Dutch language, a large number of features were extracted. Artificial neural networks, the Levenberg-Marquardt algorithm, and backward elimination were used to reduce the number of features automatically. Irrelevant features were eliminated based on the inter-rater agreement between predicted and human scores, calculated using Cohen's kappa (κ). The number of features was reduced from 457 to 28, and the remaining features were grouped into categories. The results reported in this article improve on a similar previous study. First, the inter-rater reliability between the predicted scores and human raters was increased by adjusting the corpus to address overfitting toward average scores. The resulting maximum value of κ showed substantial agreement, compared to moderate inter-rater reliability in the prior study. Second, instead of using a dedicated training and test set, the training and testing phases in the new experiments were performed using k-fold cross validation on the corpus of texts. The approach presented in this article is the first step toward our ultimate goal of providing meaningful formative feedback to students to enhance their writing skills.
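The selection loop described in the abstract (score a feature set by Cohen's kappa under k-fold cross validation, then drop features by backward elimination) can be illustrated with a minimal sketch. It is not the authors' implementation. A nearest-centroid classifier stands in for the neural network trained with Levenberg-Marquardt, the data are synthetic, and the rule "drop a feature when kappa does not decrease" is an assumption, since the abstract does not state the exact elimination criterion. All function names are hypothetical.

```python
import random
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa between two equal-length label sequences."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    expected = sum(ca[k] * cb[k] for k in ca) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences use one identical label
    return (observed - expected) / (1.0 - expected)


def fit_predict(train_X, train_y, test_X, cols):
    """Stand-in model (nearest centroid), NOT the ANN from the abstract."""
    sums, counts = {}, Counter(train_y)
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(cols))
        for j, c in enumerate(cols):
            acc[j] += row[c]
    centroids = {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

    def nearest(row):
        return min(
            sorted(centroids),
            key=lambda lab: sum(
                (row[c] - centroids[lab][j]) ** 2 for j, c in enumerate(cols)
            ),
        )

    return [nearest(row) for row in test_X]


def cv_kappa(X, y, cols, k=5, seed=0):
    """Kappa between true and predicted labels, pooled over k folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    predicted, actual = [], []
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        predicted.extend(
            fit_predict(
                [X[i] for i in train], [y[i] for i in train],
                [X[i] for i in fold], cols,
            )
        )
        actual.extend(y[i] for i in fold)
    return cohen_kappa(actual, predicted)


def backward_elimination(X, y, k=5):
    """Drop features one at a time while cross-validated kappa does not fall."""
    cols = list(range(len(X[0])))
    best = cv_kappa(X, y, cols, k)
    improved = True
    while improved and len(cols) > 1:
        improved = False
        for c in list(cols):
            trial = [d for d in cols if d != c]
            score = cv_kappa(X, y, trial, k)
            if score >= best:  # assumed criterion: removal does not hurt agreement
                cols, best, improved = trial, score, True
                break
    return cols, best


# Synthetic data: feature 0 tracks the score, features 1 and 2 are noise.
rng = random.Random(42)
y = [i % 3 for i in range(90)]
X = [[label + rng.gauss(0, 0.2), rng.gauss(0, 1), rng.gauss(0, 1)] for label in y]
kept, kappa = backward_elimination(X, y)
print(kept, round(kappa, 2))
```

On this toy data the informative feature survives elimination while the noise features are candidates for removal. With hundreds of real features, each candidate removal requires a full cross-validation run, so the cost of this greedy loop grows quickly with the feature count.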
Pages: 914-925
Number of pages: 12