Oracle-Guided Program Selection from Large Language Models

Cited by: 0
Authors
Fan, Zhiyu [1]
Ruan, Haifeng [1]
Mechtaev, Sergey [2]
Roychoudhury, Abhik [1]
Affiliations
[1] National University of Singapore, Singapore
[2] Peking University, Beijing, People's Republic of China
Funding
National Research Foundation, Singapore;
Keywords
large language model; code generation; oracle inference; differential testing;
DOI
10.1145/3650212.3680308
Abstract
While large language models (LLMs) have shown significant advancements in code generation, their susceptibility to producing incorrect code poses a significant challenge to the adoption of LLM-generated programs. This issue largely stems from the reliance on natural language descriptions as informal oracles in code generation. Current strategies to mitigate this involve selecting the best program from multiple LLM-generated alternatives, judged by criteria like the consistency of their execution results on an LLM-generated test suite. However, this approach has crucial limitations: (1) LLMs often generate redundant tests or tests that cannot distinguish between correct and incorrect solutions, and (2) the consistency criteria used, such as the majority vote, fail to foster developer trust due to the absence of a transparent rationale behind the choices made. In this work, we propose a new perspective on increasing the quality of LLM-generated code via program selection using the LLM as a test oracle. Our method is based on our experimentally confirmed observation that LLMs serve more effectively as oracles when tasked with selecting the correct output from multiple choices. Leveraging this insight, we first generate distinguishing inputs that capture semantic discrepancies of programs sampled from an LLM, and record outputs produced by the programs on these inputs. An LLM then selects the output most likely to be correct from these, guided by the natural language problem description. We implemented this idea in a tool, LLMCodeChoice, and evaluated its accuracy in generating and selecting standalone programs. Our experiments demonstrated its effectiveness in improving pass@1 by 3.6-7% on the HumanEval and MBPP benchmarks compared to the state-of-the-art CodeT. Most interestingly, the selected input-output specifications helped us to uncover incompleteness and ambiguities in task descriptions and also identify incorrect ground-truth implementations in the benchmarks.
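The selection loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `run`, `distinguishing_inputs`, `select_program`, and `stub_oracle` are hypothetical, the oracle here is a hard-coded stand-in for the LLM call, and scoring candidates by how often they agree with the oracle's choice is an assumption made for the example.

```python
from typing import Any, Callable, List, Sequence, Tuple

Program = Callable[[Any], Any]
# Hypothetical oracle interface: given an input and the distinct candidate
# outputs, return the output judged most likely to be correct.
Oracle = Callable[[Any, List[str]], str]


def run(program: Program, x: Any) -> str:
    """Run one candidate; an exception counts as its own observable output."""
    try:
        return repr(program(x))
    except Exception as exc:
        return f"<error: {type(exc).__name__}>"


def distinguishing_inputs(
    programs: Sequence[Program], inputs: Sequence[Any]
) -> List[Tuple[Any, List[str]]]:
    """Keep only the inputs on which at least two candidates disagree."""
    kept = []
    for x in inputs:
        outputs = [run(p, x) for p in programs]
        if len(set(outputs)) > 1:
            kept.append((x, outputs))
    return kept


def select_program(
    programs: Sequence[Program], inputs: Sequence[Any], oracle: Oracle
) -> int:
    """Return the index of the candidate that agrees most often with the oracle.

    Assumption: agreement counting is an illustrative scoring rule only.
    """
    scores = [0] * len(programs)
    for x, outputs in distinguishing_inputs(programs, inputs):
        chosen = oracle(x, sorted(set(outputs)))
        for i, out in enumerate(outputs):
            if out == chosen:
                scores[i] += 1
    return max(range(len(programs)), key=lambda i: scores[i])


# Toy task: "return the absolute value of x". Two candidates are wrong.
candidates: List[Program] = [lambda x: x, lambda x: -x, abs]


def stub_oracle(x: Any, choices: List[str]) -> str:
    """Stand-in for the LLM: picks the choice matching the task description."""
    expected = repr(abs(x))
    return expected if expected in choices else choices[0]


best = select_program(candidates, [-2, 0, 3], stub_oracle)
print(best)  # index of the selected candidate
```

In this toy run, the input `0` is discarded because all three candidates agree on it, so the oracle is consulted only on `-2` and `3`, the inputs that actually separate the candidates.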
Pages: 628-640
Page count: 13