An Empirical Evaluation of GitHub Copilot's Code Suggestions

Cited by: 84
Authors
Nguyen, Nhan [1]
Nadi, Sarah [1]
Affiliations
[1] Univ Alberta, Edmonton, AB, Canada
Keywords
Program Synthesis; Codex; GitHub Copilot; Empirical Evaluation;
DOI
10.1145/3524842.3528470
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
GitHub and OpenAI recently launched Copilot, an "AI pair programmer" that utilizes the power of Natural Language Processing, Static Analysis, Code Synthesis, and Artificial Intelligence. Given a natural language description of the target functionality, Copilot can generate corresponding code in several programming languages. In this paper, we perform an empirical study to evaluate the correctness and understandability of Copilot's suggested code. We use 33 LeetCode questions to create queries for Copilot in four different programming languages. We evaluate the correctness of the corresponding 132 Copilot solutions by running LeetCode's provided tests, and evaluate understandability using SonarQube's cyclomatic complexity and cognitive complexity metrics. We find that Copilot's Java suggestions have the highest correctness score (57%), while its JavaScript suggestions have the lowest (27%). Overall, Copilot's suggestions have low complexity, with no notable differences between the programming languages. We also find some potential Copilot shortcomings, such as generating code that can be further simplified and code that relies on undefined helper methods.
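For readers who want a concrete picture of the evaluation pipeline the abstract describes, the minimal Python sketch below shows the two ingredients in simplified form: a per-language correctness score computed from pass/fail test outcomes, and an approximation of cyclomatic complexity (the textbook McCabe count of decision points plus one). This is not the authors' actual harness: the function names, the sample "Two Sum" suggestion, and the pass/fail outcomes are hypothetical, and SonarQube's cyclomatic and cognitive complexity rules differ in detail from the approximation used here.

import ast

def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe cyclomatic complexity: 1 + number of decision points.

    Counts branches (if/elif, loops, exception handlers, ternaries), boolean
    operators, and comprehension filters. SonarQube's cyclomatic and cognitive
    complexity metrics follow their own rules; this is only an illustration.
    """
    decision_points = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.If, ast.For, ast.While, ast.AsyncFor,
                             ast.ExceptHandler, ast.IfExp)):
            decision_points += 1
        elif isinstance(node, ast.BoolOp):
            decision_points += len(node.values) - 1  # 'a and b and c' adds 2
        elif isinstance(node, ast.comprehension):
            decision_points += len(node.ifs)
    return 1 + decision_points


def correctness_score(outcomes: dict) -> dict:
    """Fraction of suggestions per language that passed all provided tests."""
    return {lang: sum(passed) / len(passed)
            for lang, passed in outcomes.items() if passed}


if __name__ == "__main__":
    # A hypothetical Copilot suggestion for LeetCode's "Two Sum" (illustrative only).
    suggestion = """
def two_sum(nums, target):
    seen = {}
    for i, n in enumerate(nums):
        if target - n in seen:
            return [seen[target - n], i]
        seen[n] = i
    return []
"""
    print("cyclomatic complexity:", cyclomatic_complexity(suggestion))

    # Made-up pass/fail outcomes per language -- not the paper's data.
    outcomes = {"java": [True, True, False, True],
                "javascript": [False, True, False, False]}
    print("correctness per language:", correctness_score(outcomes))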
Pages: 1-5
Page count: 5
Related Papers
  • [1] On the Robustness of Code Generation Techniques: An Empirical Study on GitHub Copilot
    Mastropaolo, Antonio
    Pascarella, Luca
    Guglielmi, Emanuela
    Ciniselli, Matteo
    Scalabrino, Simone
    Oliveto, Rocco
    Bavota, Gabriele
    [J]. 2023 IEEE/ACM 45TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ICSE, 2023, : 2149 - 2160
  • [2] Assessing the Quality of GitHub Copilot's Code Generation
    Yetistiren, Burak
    Ozsoy, Isik
    Tuzun, Eray
    [J]. PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON PREDICTIVE MODELS AND DATA ANALYTICS IN SOFTWARE ENGINEERING, PROMISE 2022, 2022, : 62 - 71
  • [3] Is GitHub's Copilot as bad as humans at introducing vulnerabilities in code?
    Asare, Owura
    Nagappan, Meiyappan
    Asokan, N.
    [J]. EMPIRICAL SOFTWARE ENGINEERING, 2023, 28 (06)
  • [4] Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions
    Pearce, Hammond
    Ahmad, Baleegh
    Tan, Benjamin
    Dolan-Gavitt, Brendan
    Karri, Ramesh
    [J]. 43RD IEEE SYMPOSIUM ON SECURITY AND PRIVACY (SP 2022), 2022, : 754 - 768
  • [5] Measuring GitHub Copilot's Impact on Productivity
    Ziegler, Albert
    Kalliamvakou, Eirini
    Li, X. Alice
    Rice, Andrew
    Rifkin, Devon
    Simister, Shawn
    Sittampalam, Ganesh
    Aftandilian, Edward
    [J]. COMMUNICATIONS OF THE ACM, 2024, 67 (03) : 54 - 63
  • [6] Using GitHub Copilot for Test Generation in Python: An Empirical Study
    El Haji, Khalid
    Brandt, Carolin
    Zaidman, Andy
    [J]. PROCEEDINGS OF THE 2024 IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATION OF SOFTWARE TEST, AST 2024, 2024, : 45 - 55
  • [7] Is GitHub Copilot a Substitute for Human Pair-programming? An Empirical Study
    Imai, Saki
    [J]. 2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: COMPANION PROCEEDINGS (ICSE-COMPANION 2022), 2022, : 319 - 321
  • [8] Exploring the Effect of Multiple Natural Languages on Code Suggestion Using GitHub Copilot
    Koyanagi, Kei
    Wang, Dong
    Noguchi, Kotaro
    Kondo, Masanari
    Serebrenik, Alexander
    Kamei, Yasutaka
    Ubayashi, Naoyasu
    [J]. 2024 IEEE/ACM 21ST INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES, MSR, 2024, : 481 - 486
  • [9] CodexLeaks: Privacy Leaks from Code Generation Language Models in GitHub Copilot
    Niu, Liang
    Mirza, Shujaat
    Maradni, Zayd
    Pöpper, Christina
    [J]. PROCEEDINGS OF THE 32ND USENIX SECURITY SYMPOSIUM, 2023, : 2133 - 2150