Creating Thorough Tests for AI-Generated Code is Hard

Cited by: 0
Authors:
Singhal, Shreya [1]
Kumar, Viraj [2]
Affiliations:
[1] Indian Inst Technol Madras, Chennai, Tamil Nadu, India
[2] Indian Inst Sci, Bangalore, Karnataka, India
Keywords:
DOI: 10.1145/3627217.3627238
Chinese Library Classification: TP301 [Theory, Methods]
Subject Classification Code: 081202
Abstract:
Before implementing a function, programmers are encouraged to write a suite of test cases that specify its intended behaviour on several inputs. A suite of tests is thorough if any buggy implementation fails at least one of these tests. We posit that as the proportion of code generated by Large Language Models (LLMs) grows, so must the ability of students to create test suites that are thorough enough to detect subtle bugs in such code. Our paper makes two contributions. First, we demonstrate how difficult it can be to create thorough tests for LLM-generated code by evaluating 27 test suites from a public dataset (EvalPlus). Second, by identifying deficiencies in these test suites, we propose strategies for improving the ability of students to develop thorough test suites for LLM-generated code.
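To make the abstract's notion of a thorough test suite concrete, here is a minimal hypothetical sketch (not drawn from the paper or from EvalPlus; the `average` function and both suites are invented for illustration). A weak suite can pass a subtly buggy implementation, while a thorough suite contains at least one input that exposes the bug:

```python
# Hypothetical illustration: a subtly buggy implementation of
# arithmetic mean, of the kind an LLM might plausibly generate.

def average(nums):
    # Subtle bug: floor division truncates, so average([1, 2])
    # returns 1 instead of the exact mean 1.5.
    return sum(nums) // len(nums)

# A weak suite: every input happens to divide evenly, so the
# truncation bug stays hidden and all tests pass.
weak_suite = [([2, 4], 3), ([5, 5, 5], 5)]

# A more thorough suite adds one input where truncation changes
# the result, so the buggy implementation fails at least one test.
thorough_suite = weak_suite + [([1, 2], 1.5)]

def run(suite):
    # Return, per test case, whether the implementation matches
    # the expected output.
    return [average(inp) == expected for inp, expected in suite]

print(run(weak_suite))      # → [True, True] (bug undetected)
print(run(thorough_suite))  # → [True, True, False] (bug caught)
```

In the paper's terms, `weak_suite` is not thorough because this buggy implementation passes all of its tests; adding a single well-chosen input makes the suite able to detect the bug.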
Pages: 108-111
Page count: 4
Related papers (50 total):
  • [1] AI-Generated Code Not Considered Harmful
    Kendon, Tyson
    Wu, Leanne
    Aycock, John
    PROCEEDINGS OF THE 25TH WESTERN CANADIAN CONFERENCE ON COMPUTING EDUCATION, 2023,
  • [2] Navigating (in)security of AI-generated code
    Ambati, Sri Haritha
    Ridley, Norah
    Branca, Enrico
    Stakhanova, Natalia
    2024 IEEE INTERNATIONAL CONFERENCE ON CYBER SECURITY AND RESILIENCE, CSR, 2024, : 30 - 37
  • [3] Validating AI-Generated Code with Live Programming
    Ferdowsi, Kasra
    Huang, Ruanqianqian
    James, Michael B.
    Polikarpova, Nadia
    Lerner, Sorin
    PROCEEDINGS OF THE 2024 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI 2024), 2024,
  • [4] DeVAIC: A tool for security assessment of AI-generated code
    Cotroneo, Domenico
    De Luca, Roberta
    Liguori, Pietro
    INFORMATION AND SOFTWARE TECHNOLOGY, 2025, 177
  • [5] Assessing AI Detectors in Identifying AI-Generated Code: Implications for Education
    Pan, Wei Hung
    Chok, Ming Jie
    Wong, Jonathan Leong Shan
    Shin, Yung Xin
    Poon, Yeong Shian
    Yang, Zhou
    Chong, Chun Yong
    Lo, David
    Lim, Mei Kuan
    2024 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: SOFTWARE ENGINEERING EDUCATION AND TRAINING, ICSE-SEET 2024, 2024, : 1 - 11
  • [6] A Quantitative Analysis of Quality and Consistency in AI-generated Code
    Clark, Autumn
    Igbokwe, Daniel
    Ross, Samantha
    Zibran, Minhaz F.
    2024 7TH INTERNATIONAL CONFERENCE ON SOFTWARE AND SYSTEM ENGINEERING, ICOSSE 2024, 2024, : 37 - 41
  • [7] EX-CODE: A Robust and Explainable Model to Detect AI-Generated Code
    Bulla, Luana
    Midolo, Alessandro
    Mongiovi, Misael
    Tramontana, Emiliano
    INFORMATION, 2024, 15 (12)
  • [8] Automating the correctness assessment of AI-generated code for security contexts
    Cotroneo, Domenico
    Foggia, Alessio
    Improta, Cristina
    Liguori, Pietro
    Natella, Roberto
    JOURNAL OF SYSTEMS AND SOFTWARE, 2024, 216
  • [9] Poisoning Programs by Un-Repairing Code: Security Concerns of AI-generated Code
    Improta, Cristina
    2023 IEEE 34TH INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING WORKSHOPS, ISSREW, 2023, : 128 - 131
  • [10] ColDeco: An End User Spreadsheet Inspection Tool for AI-Generated Code
    Ferdowsi, Kasra
    Williams, Jack
    Drosos, Ian
    Gordon, Andrew D.
    Negreanu, Carina
    Polikarpova, Nadia
    Sarkar, Advait
    Zorn, Benjamin
    2023 IEEE SYMPOSIUM ON VISUAL LANGUAGES AND HUMAN-CENTRIC COMPUTING, VL/HCC, 2023, : 82 - 91