Grading Documentation with Machine Learning

Authors
Messer, Marcus [1]
Shi, Miaojing [2]
Brown, Neil C. C. [1]
Kolling, Michael [1]
Affiliations
[1] Kings Coll London, Dept Informat, London, England
[2] Tongji Univ, Coll Elect & Informat Engn, Shanghai, Peoples R China
Keywords
Automated Grading; Assessment; Computer Science Education; Machine Learning; Large Language Models; Documentation; Programming Education
DOI
10.1007/978-3-031-64302-6_8
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Professional developers, and especially students learning to program, often write poor documentation. While automated assessment for programming is becoming more common in educational settings, often using unit tests for code functionality and static analysis for code quality, documentation assessment is typically limited to detecting the presence and the correct formatting of a docstring based on a specified style guide. We aim to investigate how machine learning can be utilised to aid in automating the assessment of documentation quality. We classify a large set of publicly available human-annotated relevance scores between a natural language string and a code string, using traditional approaches, such as Logistic Regression and Random Forest, fine-tuned large language models, such as BERT and GPT, and Low-Rank Adaptation of large language models. Our most accurate model was a fine-tuned CodeBERT model, resulting in a test accuracy of 89%.
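To make the task framing in the abstract concrete, the following is a minimal sketch of relevance classification between a natural language string and a code string using logistic regression. It is not the authors' pipeline: the token-overlap features, the six toy training pairs, and every function name here are assumptions made purely for illustration, and the paper's dataset and feature representation are not reproduced.

```python
import math
import re


def tokens(text):
    """Lower-case word tokens; camelCase and snake_case identifiers are split."""
    text = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return set(re.findall(r"[a-z]+", text.lower()))


def features(doc, code):
    """Hypothetical features: bias term, Jaccard overlap, docstring coverage."""
    d, c = tokens(doc), tokens(code)
    overlap = len(d & c)
    jaccard = overlap / len(d | c) if (d | c) else 0.0
    coverage = overlap / len(d) if d else 0.0
    return [1.0, jaccard, coverage]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(pairs, labels, lr=0.5, epochs=2000):
    """Fit logistic regression weights with plain batch gradient descent."""
    rows = [features(doc, code) for doc, code in pairs]
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for x, y in zip(rows, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def predict_proba(weights, doc, code):
    """Estimated probability that the description is relevant to the code."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, features(doc, code))))


# Toy, hand-written examples (assumed for illustration; label 1 = relevant).
ADD = "def sum_numbers(a, b): return a + b"
READ = "def read_lines(file): return file.readlines()"
REVERSE = "def reverse_string(s): return s[::-1]"

TRAIN_PAIRS = [
    ("Return the sum of two numbers", ADD),
    ("Read all lines from a file", READ),
    ("Reverse the given string", REVERSE),
    ("Connect to the database", ADD),
    ("Draw a red circle", READ),
    ("TODO fix this later", REVERSE),
]
TRAIN_LABELS = [1, 1, 1, 0, 0, 0]

if __name__ == "__main__":
    w = train(TRAIN_PAIRS, TRAIN_LABELS)
    print(predict_proba(w, "Count the words in a text",
                        "def count_words(text): return len(text.split())"))
```

Lexical overlap is a deliberately weak signal: a docstring can reuse the function's identifiers and still describe it incorrectly, which is one plausible reason learned representations such as the fine-tuned models named in the abstract are also evaluated.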
Pages: 105-117
Page count: 13