CODAMOSA: Escaping Coverage Plateaus in Test Generation with Pre-trained Large Language Models

Cited by: 32
Authors
Lemieux, Caroline [1 ]
Inala, Jeevana Priya [2 ]
Lahiri, Shuvendu K. [2 ]
Sen, Siddhartha [2 ]
Affiliations
[1] University of British Columbia, Vancouver, BC, Canada
[2] Microsoft Research, Redmond, WA, USA
Keywords
ALGORITHM
DOI
10.1109/ICSE48619.2023.00085
CLC Classification Number
TP31 [Computer Software]
Subject Classification Codes
081202; 0835
Abstract
Search-based software testing (SBST) generates high-coverage test cases for programs under test with a combination of test case generation and mutation. SBST's performance relies on there being a reasonable probability of generating test cases that exercise the core logic of the program under test. Given such test cases, SBST can then explore the space around them to exercise various parts of the program. This paper explores whether Large Language Models (LLMs) of code, such as OpenAI's Codex, can be used to help SBST's exploration. Our proposed algorithm, CODAMOSA, conducts SBST until its coverage improvements stall, then asks Codex to provide example test cases for under-covered functions. These examples help SBST redirect its search to more useful areas of the search space. On an evaluation over 486 benchmarks, CODAMOSA achieves statistically significantly higher coverage on many more benchmarks (173 and 279) than it reduces coverage on (10 and 4), compared to SBST and LLM-only baselines.
Pages: 919 - 931
Number of pages: 13
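
Illustrative sketch (not the authors' implementation): the plateau-escape loop described in the abstract, written in Python with the SBST step, under-covered-function selection, LLM query, and test deserialization supplied as placeholder callables. All names and defaults below are assumptions for illustration only.

from typing import Callable, List, Tuple

def codamosa_style_loop(
    sbst_step: Callable[[List[str]], Tuple[List[str], float]],  # one SBST iteration -> (population, coverage)
    pick_undercovered: Callable[[List[str]], str],              # choose a poorly covered function
    query_llm: Callable[[str], List[str]],                      # ask a Codex-like model for example tests
    deserialize: Callable[[List[str]], List[str]],              # turn LLM text into SBST test cases
    iterations: int = 100,
    stall_threshold: int = 5,
) -> List[str]:
    population: List[str] = []
    best_coverage = 0.0
    stalled = 0

    for _ in range(iterations):
        # 1. Ordinary search-based generation/mutation of test cases.
        population, coverage = sbst_step(population)

        # 2. Track whether coverage has plateaued.
        if coverage > best_coverage:
            best_coverage, stalled = coverage, 0
        else:
            stalled += 1

        # 3. On a plateau, ask the LLM for example tests targeting an
        #    under-covered function and fold them back into the search
        #    so SBST can mutate and extend them like any other test case.
        if stalled >= stall_threshold:
            target = pick_undercovered(population)
            population.extend(deserialize(query_llm(target)))
            stalled = 0

    return population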