LeanDojo: Theorem Proving with Retrieval-Augmented Language Models

Cited by: 0
Authors
Yang, Kaiyu [1 ]
Swope, Aidan M. [2 ]
Gu, Alex
Chalamala, Rahul [1 ]
Song, Peiyang [3 ]
Yu, Shixing [4 ]
Godil, Saad
Prenger, Ryan [2 ]
Anandkumar, Anima [1 ,2 ]
Affiliations
[1] Caltech, Pasadena, CA 91125, USA
[2] NVIDIA, Santa Clara, CA, USA
[3] UC Santa Barbara, Santa Barbara, CA, USA
[4] UT Austin, Austin, TX, USA
Funding
U.S. National Science Foundation;
Keywords
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Large language models (LLMs) have shown promise in proving formal theorems using proof assistants such as Lean. However, existing methods are difficult to reproduce or build on because of private code, private data, and large compute requirements, which has created substantial barriers to research on machine learning methods for theorem proving. This paper removes these barriers by introducing LeanDojo: an open-source Lean playground consisting of toolkits, data, models, and benchmarks. LeanDojo extracts data from Lean and enables programmatic interaction with the proof environment. It contains fine-grained annotations of premises in proofs, providing valuable data for premise selection, a key bottleneck in theorem proving. Using this data, we develop ReProver (Retrieval-Augmented Prover): an LLM-based prover augmented with retrieval for selecting premises from a vast math library. It is inexpensive to train, requiring only one GPU-week. Our retriever leverages LeanDojo's program analysis capability to identify accessible premises and hard negative examples, which makes retrieval much more effective. Furthermore, we construct a new benchmark of 98,734 theorems and proofs extracted from Lean's math library. It features a challenging data split that requires the prover to generalize to theorems relying on novel premises never used in training. We use this benchmark for training and evaluation, and experimental results demonstrate the effectiveness of ReProver over non-retrieval baselines and GPT-4. We thus provide the first set of open-source LLM-based theorem provers built without any proprietary datasets, and we release them under a permissive MIT license to facilitate further research.
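As a concrete illustration of the programmatic interaction the abstract describes, the sketch below drives a Lean proof through LeanDojo's Python API. It follows the interface in LeanDojo's public documentation (LeanGitRepo, Theorem, Dojo, run_tac), but the commit hash, file path, and theorem name are placeholders, and exact signatures may vary across LeanDojo versions.

```python
from lean_dojo import Dojo, LeanGitRepo, Theorem, ProofFinished

# Placeholder repo/commit/theorem: substitute a real mathlib commit
# and a theorem that exists in the file at that commit.
repo = LeanGitRepo("https://github.com/leanprover-community/mathlib", "<commit-hash>")
theorem = Theorem(repo, "src/algebra/group/basic.lean", "example_theorem_name")

# Dojo turns the theorem into an interactive environment: we submit
# tactics and receive the resulting proof states, until the proof is
# finished or an error is returned.
with Dojo(theorem) as (dojo, init_state):
    result = dojo.run_tac(init_state, "simp")
    if isinstance(result, ProofFinished):
        print("Proof complete.")
    else:
        print("Result:", result)  # remaining tactic state or an error
```

A prover such as ReProver sits in a loop around run_tac, generating candidate tactics with an LLM and searching over the proof states those tactics produce.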
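The retrieval step itself can be pictured as standard dense retrieval: embed the current proof state and every accessible premise, then rank premises by cosine similarity. The sketch below is a minimal, self-contained illustration of that ranking only; ReProver's actual retriever is a fine-tuned ByT5 encoder trained with the hard negatives mentioned above, whereas here the encoder is stubbed with a byte-count vectorizer purely so the example runs.

```python
import numpy as np

def encode(texts):
    """Stand-in for a trained encoder (ReProver fine-tunes ByT5);
    here we hash UTF-8 bytes into a fixed-size count vector."""
    vecs = np.zeros((len(texts), 256))
    for i, t in enumerate(texts):
        for b in t.encode("utf-8"):
            vecs[i, b] += 1.0
    # L2-normalize so dot products equal cosine similarities.
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

def retrieve_premises(state, premises, k=3):
    """Rank accessible premises by cosine similarity to the proof state."""
    q = encode([state])           # (1, d) query embedding
    p = encode(premises)          # (n, d) premise embeddings
    scores = (p @ q.T).ravel()    # cosine similarity per premise
    top = np.argsort(-scores)[:k]
    return [(premises[i], float(scores[i])) for i in top]

state = "⊢ ∀ (n : ℕ), n + 0 = n"
premises = [
    "theorem add_zero (n : ℕ) : n + 0 = n",
    "theorem zero_add (n : ℕ) : 0 + n = n",
    "theorem mul_one (n : ℕ) : n * 1 = n",
]
print(retrieve_premises(state, premises, k=2))
```

Swapping the stub for a trained encoder leaves retrieve_premises unchanged, which is the appeal of factoring premise selection as dense retrieval.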
Pages: 40