Grading Documentation with Machine Learning

Authors
Messer, Marcus [1]
Shi, Miaojing [2]
Brown, Neil C. C. [1]
Kolling, Michael [1]
Affiliations
[1] Kings Coll London, Dept Informat, London, England
[2] Tongji Univ, Coll Elect & Informat Engn, Shanghai, Peoples R China
Keywords
Automated Grading; Assessment; Computer Science Education; Machine Learning; Large Language Models; Documentation; Programming Education
DOI
10.1007/978-3-031-64302-6_8
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Professional developers, and especially students learning to program, often write poor documentation. While automated assessment for programming is becoming more common in educational settings, often using unit tests for code functionality and static analysis for code quality, documentation assessment is typically limited to detecting the presence and the correct formatting of a docstring based on a specified style guide. We aim to investigate how machine learning can be utilised to aid in automating the assessment of documentation quality. We classify a large set of publicly available human-annotated relevance scores between a natural language string and a code string, using traditional approaches, such as Logistic Regression and Random Forest; fine-tuned large language models, such as BERT and GPT; and Low-Rank Adaptation of large language models. Our most accurate model was a fine-tuned CodeBERT model, resulting in a test accuracy of 89%.
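To make the classification task in the abstract concrete, the following is a minimal sketch of a logistic-regression relevance classifier over (docstring, code) pairs. It is not the authors' method: the single Jaccard token-overlap feature, the toy data, and the names `tokens`, `overlap`, `train`, and `predict_proba` are all assumptions made for illustration, and the paper's dataset and features are not reproduced here.

```python
import math
import re


def tokens(text):
    """Lowercased word tokens; snake_case and camelCase names are split."""
    text = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return set(re.findall(r"[a-z]+", text.lower().replace("_", " ")))


def overlap(doc, code):
    """Jaccard overlap between docstring tokens and code tokens (assumed feature)."""
    d, c = tokens(doc), tokens(code)
    return len(d & c) / len(d | c) if d | c else 0.0


def train(pairs, labels, lr=1.0, epochs=2000):
    """Fit a one-feature logistic regression by batch gradient descent."""
    w, b = 0.0, 0.0
    xs = [overlap(d, c) for d, c in pairs]
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, labels):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            gw += (p - y) * x
            gb += p - y
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b


def predict_proba(model, doc, code):
    """Estimated probability that the docstring is relevant to the code."""
    w, b = model
    return 1.0 / (1.0 + math.exp(-(w * overlap(doc, code) + b)))


# Hypothetical toy data: 1 = docstring describes the code, 0 = it does not.
PAIRS = [
    ("Return the sum of two numbers",
     "def sum_numbers(a, b): return a + b"),
    ("Read a file and return its lines",
     "def read_lines(file): return open(file).readlines()"),
    ("Return the sum of two numbers",
     "def read_lines(file): return open(file).readlines()"),
    ("Read a file and return its lines",
     "def sum_numbers(a, b): return a + b"),
]
LABELS = [1, 1, 0, 0]

model = train(PAIRS, LABELS)
```

Calling `predict_proba(model, docstring, code)` then gives a relevance estimate for a new pair. A lexical-overlap feature like this cannot recognise a docstring that paraphrases the code without sharing its vocabulary, which is one plausible reason to compare such baselines against fine-tuned language models.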
Pages: 105-117
Number of pages: 13