Using GitHub Copilot for Test Generation in Python: An Empirical Study

Cited by: 0
Authors
El Haji, Khalid [1]
Brandt, Carolin [1]
Zaidman, Andy [1]
Affiliations
[1] Delft Univ Technol, Delft, Netherlands
DOI: 10.1145/3644032.3644443
CLC Number: TP31 [Computer Software]
Subject Classification Codes: 081202; 0835
Abstract
Writing unit tests is a crucial part of software development, but it is also recognized as time-consuming and tedious. Numerous test generation approaches have therefore been proposed and investigated; however, most of these tools produce tests that are typically difficult to understand. Recently, Large Language Models (LLMs) have shown promising results in generating source code and supporting software engineering tasks. We therefore investigate the usability of tests generated by GitHub Copilot, a proprietary, closed-source code generation tool that uses an LLM. We evaluate GitHub Copilot's test generation abilities both within an existing test suite and without one, and we study the impact of different code commenting strategies on the generated tests. Our investigation evaluates the usability of 290 tests generated by GitHub Copilot for 53 sampled tests from open source projects. Our findings highlight that, within an existing test suite, 45.28% of the tests generated by Copilot are passing tests, while 54.72% are failing, broken, or empty. Furthermore, when we generate tests with Copilot without an existing test suite in place, 92.45% of the tests are failing, broken, or empty. Additionally, we study how test method comments influence the usability of the generated tests.
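To make the setting concrete, the sketch below illustrates, with a hypothetical function and test names that are not taken from the paper, what "generating a test within an existing test suite" and a comment-based prompt can look like in Python: an existing hand-written test provides context, and a new test method whose name and leading comment describe the intended behavior is the kind of prompt whose body a tool such as Copilot would be asked to complete.

import pytest


def divide(a: float, b: float) -> float:
    """Return a / b, raising ValueError on division by zero."""
    if b == 0:
        raise ValueError("division by zero")
    return a / b


def test_divide_returns_quotient():
    # Existing, hand-written test: the surrounding suite that a tool
    # like Copilot can draw on when completing a new test body.
    assert divide(10, 4) == 2.5


def test_divide_by_zero_raises_value_error():
    # A descriptive method name and comment such as this one is the kind
    # of prompt the abstract's commenting strategies refer to; the body
    # below shows what a usable (passing) generated test would look like.
    with pytest.raises(ValueError):
        divide(1, 0)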
Pages: 45-55 (11 pages)
Related Papers (50 total)
  • [1] An empirical study of automated unit test generation for Python
    Lukasczyk, Stephan
    Kroiß, Florian
    Fraser, Gordon
    Empirical Software Engineering, 2023, 28(2)
  • [2] On the Robustness of Code Generation Techniques: An Empirical Study on GitHub Copilot
    Mastropaolo, Antonio
    Pascarella, Luca
    Guglielmi, Emanuela
    Ciniselli, Matteo
    Scalabrino, Simone
    Oliveto, Rocco
    Bavota, Gabriele
    2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), 2023: 2149-2160
  • [3] What Are the Dominant Projects in the GitHub Python Ecosystem?
    Ma, Wanwangying
    Chen, Lin
    Zhou, Yuming
    Xu, Baowen
    Proceedings 2016 Third International Conference on Trustworthy Systems and Their Applications (TSA), 2016: 87-95
  • [4] Empirical Study of Python Call Graph
    Li, Yu
    34th IEEE/ACM International Conference on Automated Software Engineering (ASE 2019), 2019: 1274-1276
  • [5] Using Relative Lines of Code to Guide Automated Test Generation for Python
    Holmes, Josie
    Ahmed, Iftekhar
    Brindescu, Caius
    Gopinath, Rahul
    Zhang, He
    Groce, Alex
    ACM Transactions on Software Engineering and Methodology, 2020, 29(4)
  • [6] Generation of Test Questions from RDF Files Using Python and SPARQL
    Omarbekova, Assel
    Sharipbay, Altynbek
    Barlybaev, Alibek
    2017 International Conference on Control Engineering and Artificial Intelligence (CCEAI 2017), 2017, 806
  • [7] Pynguin: Automated Unit Test Generation for Python
    Lukasczyk, Stephan
    Fraser, Gordon
    2022 ACM/IEEE 44th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion 2022), 2022: 168-172
  • [8] An Empirical Study of Flaky Tests in Python
    Gruber, Martin
    Lukasczyk, Stephan
    Kroiß, Florian
    Fraser, Gordon
    2021 14th IEEE Conference on Software Testing, Verification and Validation (ICST 2021), 2021: 148-158
  • [9] An Empirical Study on Bugs in Python Interpreters
    Wang, Ziyuan
    Bu, Dexin
    Sun, Aiyue
    Gou, Shanyi
    Wang, Yong
    Chen, Lin
    IEEE Transactions on Reliability, 2022, 71(2): 716-734