Creating Thorough Tests for AI-Generated Code is Hard

Cited by: 0
Authors:
Singhal, Shreya [1]
Kumar, Viraj [2]
Affiliations:
[1] Indian Inst Technol Madras, Chennai, Tamil Nadu, India
[2] Indian Inst Sci, Bangalore, Karnataka, India
Keywords:
DOI: 10.1145/3627217.3627238
Chinese Library Classification: TP301 [Theory, Methods]
Subject Classification Code: 081202
Abstract:
Before implementing a function, programmers are encouraged to write a suite of test cases that specify its intended behaviour on several inputs. A suite of tests is thorough if any buggy implementation fails at least one of these tests. We posit that as the proportion of code generated by Large Language Models (LLMs) grows, so must the ability of students to create test suites that are thorough enough to detect subtle bugs in such code. Our paper makes two contributions. First, we demonstrate how difficult it can be to create thorough tests for LLM-generated code by evaluating 27 test suites from a public dataset (EvalPlus). Second, by identifying deficiencies in these test suites, we propose strategies for improving the ability of students to develop thorough test suites for LLM-generated code.
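To make the abstract's notion of a thorough test suite concrete, here is a minimal hypothetical sketch (not drawn from the paper or from EvalPlus; the `average` function and both suites are invented for illustration). A weak suite can pass a subtly buggy implementation, while a thorough suite contains at least one input that exposes the bug:

```python
# Hypothetical illustration: a subtly buggy implementation of
# arithmetic mean, of the kind an LLM might plausibly generate.

def average(nums):
    # Subtle bug: floor division truncates, so average([1, 2])
    # returns 1 instead of the exact mean 1.5.
    return sum(nums) // len(nums)

# A weak suite: every input happens to divide evenly, so the
# truncation bug stays hidden and all tests pass.
weak_suite = [([2, 4], 3), ([5, 5, 5], 5)]

# A more thorough suite adds one input where truncation changes
# the result, so the buggy implementation fails at least one test.
thorough_suite = weak_suite + [([1, 2], 1.5)]

def run(suite):
    # Return, per test case, whether the implementation matches
    # the expected output.
    return [average(inp) == expected for inp, expected in suite]

print(run(weak_suite))      # → [True, True] (bug undetected)
print(run(thorough_suite))  # → [True, True, False] (bug caught)
```

In the paper's terms, `weak_suite` is not thorough because this buggy implementation passes all of its tests; adding a single well-chosen input makes the suite able to detect the bug.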
Pages: 108-111
Page count: 4
Related papers (50 total):
  • [1] AI-Generated Code Not Considered Harmful
    Kendon, Tyson
    Wu, Leanne
    Aycock, John
    PROCEEDINGS OF THE 25TH WESTERN CANADIAN CONFERENCE ON COMPUTING EDUCATION, 2023,
  • [2] Navigating (in)security of AI-generated code
    Ambati, Sri Haritha
    Ridley, Norah
    Branca, Enrico
    Stakhanova, Natalia
    2024 IEEE INTERNATIONAL CONFERENCE ON CYBER SECURITY AND RESILIENCE, CSR, 2024, : 30 - 37
  • [3] Validating AI-Generated Code with Live Programming
    Ferdowsi, Kasra
    Huang, Ruanqianqian
    James, Michael B.
    Polikarpova, Nadia
    Lerner, Sorin
    PROCEEDINGS OF THE 2024 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI 2024), 2024,
  • [4] DeVAIC: A tool for security assessment of AI-generated code
    Cotroneo, Domenico
    De Luca, Roberta
    Liguori, Pietro
    INFORMATION AND SOFTWARE TECHNOLOGY, 2025, 177
  • [5] Assessing AI Detectors in Identifying AI-Generated Code: Implications for Education
    Pan, Wei Hung
    Chok, Ming Jie
    Wong, Jonathan Leong Shan
    Shin, Yung Xin
    Poon, Yeong Shian
    Yang, Zhou
    Chong, Chun Yong
    Lo, David
    Lim, Mei Kuan
    2024 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: SOFTWARE ENGINEERING EDUCATION AND TRAINING, ICSE-SEET 2024, 2024, : 1 - 11
  • [6] A Quantitative Analysis of Quality and Consistency in AI-generated Code
    Clark, Autumn
    Igbokwe, Daniel
    Ross, Samantha
    Zibran, Minhaz F.
    2024 7TH INTERNATIONAL CONFERENCE ON SOFTWARE AND SYSTEM ENGINEERING, ICOSSE 2024, 2024, : 37 - 41
  • [7] EX-CODE: A Robust and Explainable Model to Detect AI-Generated Code
    Bulla, Luana
    Midolo, Alessandro
    Mongiovi, Misael
    Tramontana, Emiliano
    INFORMATION, 2024, 15 (12)
  • [8] Automating the correctness assessment of AI-generated code for security contexts
    Cotroneo, Domenico
    Foggia, Alessio
    Improta, Cristina
    Liguori, Pietro
    Natella, Roberto
    JOURNAL OF SYSTEMS AND SOFTWARE, 2024, 216
  • [9] Poisoning Programs by Un-Repairing Code: Security Concerns of AI-generated Code
    Improta, Cristina
    2023 IEEE 34TH INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING WORKSHOPS, ISSREW, 2023, : 128 - 131
  • [10] ColDeco: An End User Spreadsheet Inspection Tool for AI-Generated Code
    Ferdowsi, Kasra
    Williams, Jack
    Drosos, Ian
    Gordon, Andrew D.
    Negreanu, Carina
    Polikarpova, Nadia
    Sarkar, Advait
    Zorn, Benjamin
    2023 IEEE SYMPOSIUM ON VISUAL LANGUAGES AND HUMAN-CENTRIC COMPUTING, VL/HCC, 2023, : 82 - 91