LeanDojo: Theorem Proving with Retrieval-Augmented Language Models

Authors
Yang, Kaiyu [1]
Swope, Aidan M. [2]
Gu, Alex
Chalamala, Rahul [1]
Song, Peiyang [3]
Yu, Shixing [4]
Godil, Saad
Prenger, Ryan [2]
Anandkumar, Anima [1, 2]
Affiliations
[1] CALTECH, Pasadena, CA 91125 USA
[2] NVIDIA, Santa Clara, CA USA
[3] UC Santa Barbara, Santa Barbara, CA USA
[4] UT Austin, Austin, TX USA
Funding
National Science Foundation (USA)
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Large language models (LLMs) have shown promise in proving formal theorems using proof assistants such as Lean. However, existing methods are difficult to reproduce or build on, due to private code, data, and large compute requirements. This has created substantial barriers to research on machine learning methods for theorem proving. This paper removes these barriers by introducing LeanDojo: an open-source Lean playground consisting of toolkits, data, models, and benchmarks. LeanDojo extracts data from Lean and enables interaction with the proof environment programmatically. It contains fine-grained annotations of premises in proofs, providing valuable data for premise selection, a key bottleneck in theorem proving. Using this data, we develop ReProver (Retrieval-Augmented Prover): an LLM-based prover augmented with retrieval for selecting premises from a vast math library. It is inexpensive and needs only one GPU week of training. Our retriever leverages LeanDojo's program analysis capability to identify accessible premises and hard negative examples, which makes retrieval much more effective. Furthermore, we construct a new benchmark consisting of 98,734 theorems and proofs extracted from Lean's math library. It features a challenging data split that requires the prover to generalize to theorems relying on novel premises that are never used in training. We use this benchmark for training and evaluation, and experimental results demonstrate the effectiveness of ReProver over non-retrieval baselines and GPT-4. We thus provide the first set of open-source LLM-based theorem provers without any proprietary datasets and release it under a permissive MIT license to facilitate further research.
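The retrieval step mentioned in the abstract (ranking only the premises that are accessible to the current theorem) can be illustrated with a small sketch. This is a toy under stated assumptions, not LeanDojo's API or ReProver's learned retriever: the bag-of-tokens `embed` function, the `retrieve` helper, and the example premise names and statements are all hypothetical stand-ins chosen for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: token counts. Assumption: a real retriever would use a learned encoder."""
    return Counter(re.findall(r"[A-Za-z_]+|\S", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(state, premises, accessible, k=2):
    """Return the names of the top-k accessible premises most similar to the proof state."""
    query = embed(state)
    scored = [
        (cosine(query, embed(statement)), name)
        for name, statement in premises.items()
        if name in accessible  # inaccessible premises are never candidates
    ]
    # Highest score first; ties broken by name for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


# Hypothetical premise library: name -> statement text.
premises = {
    "add_comm": "a + b = b + a",
    "mul_comm": "a * b = b * a",
    "add_zero": "a + 0 = a",
    "later_lemma": "a + b = b + a",  # matches the goal but is not accessible
}
accessible = {"add_comm", "mul_comm", "add_zero"}
state = "x + y = y + x"

top = retrieve(state, premises, accessible, k=2)
print(top)
```

Restricting the candidate set before scoring is the point of the sketch: a premise that is textually a perfect match is still excluded if it cannot be used at that position in the library.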
Pages: 40