SATLM: Satisfiability-Aided Language Models Using Declarative Prompting

Cited by: 0
Authors
Ye, Xi [1 ]
Chen, Qiaochu [1 ]
Dillig, Isil [1 ]
Durrett, Greg [1 ]
Affiliation
[1] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA
Funding
National Science Foundation (USA)
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Prior work has combined chain-of-thought prompting in large language models (LLMs) with programmatic representations to perform effective and transparent reasoning. While such an approach works well for tasks that only require forward reasoning (e.g., straightforward arithmetic), it is less effective for constraint solving problems that require more sophisticated planning and search. In this paper, we propose a new satisfiability-aided language modeling (SATLM) approach for improving the reasoning capabilities of LLMs. We use an LLM to generate a declarative task specification rather than an imperative program and leverage an off-the-shelf automated theorem prover to derive the final answer. This approach has two key advantages. The declarative specification is closer to the problem description than the reasoning steps are, so the LLM can parse it out of the description more accurately. Furthermore, by offloading the actual reasoning task to an automated theorem prover, our approach can guarantee the correctness of the answer with respect to the parsed specification and avoid planning errors in the solving process. We evaluate SATLM on 8 different datasets and show that it consistently outperforms program-aided LMs in the imperative paradigm. In particular, SATLM outperforms program-aided LMs by 23% on a challenging subset of the GSM arithmetic reasoning dataset; SATLM also achieves a new SoTA on LSAT and BOARDGAMEQA, surpassing previous models that are trained on the respective training sets.
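The abstract's key contrast is between an imperative program that computes the answer step by step and a declarative specification (variables plus constraints) that a solver discharges. A minimal sketch of that second stage follows; it is illustrative only, not the paper's implementation. The paper uses an off-the-shelf automated theorem prover (an SMT solver), while here a brute-force search over small integer domains stands in for the solver, and the example puzzle, variable names, and `solve` helper are all assumptions made for illustration.

```python
from itertools import product

def solve(variables, domains, constraints):
    """Return the first assignment satisfying every constraint, or None.

    Stand-in for an automated theorem prover: exhaustively searches the
    (small, finite) domains instead of doing real SAT/SMT solving.
    """
    names = list(variables)
    for values in product(*(domains[v] for v in names)):
        assignment = dict(zip(names, values))
        if all(check(assignment) for check in constraints):
            return assignment
    return None

# A declarative spec an LLM might parse from the (hypothetical) word problem:
# "Alice has twice as many apples as Bob; together they have 12 apples."
# Note the spec mirrors the problem sentences directly -- no solving plan.
spec = {
    "variables": ["alice", "bob"],
    "domains": {"alice": range(13), "bob": range(13)},
    "constraints": [
        lambda a: a["alice"] == 2 * a["bob"],   # "twice as many as Bob"
        lambda a: a["alice"] + a["bob"] == 12,  # "together they have 12"
    ],
}

answer = solve(spec["variables"], spec["domains"], spec["constraints"])
print(answer)  # {'alice': 8, 'bob': 4}
```

Because the reasoning is delegated to the solver, the returned assignment is guaranteed to satisfy the parsed constraints; any remaining error can only come from mis-parsing the problem into the specification, which is the paper's motivation for the declarative paradigm.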
Pages: 33
Related papers
50 results in total
  • [1] Translation Titans, Reasoning Challenges: Satisfiability-Aided Language Models for Detecting Conflicting Requirements
    Fazelnia, Mohamad
    Mirakhorli, Mehdi
    Bagheri, Hamid
    PROCEEDINGS OF 2024 39TH ACM/IEEE INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING, ASE 2024, 2024, : 2294 - 2298
  • [2] Considerations for Prompting Large Language Models
    Schulte, Brian
    JAMA ONCOLOGY, 2024, 10 (04) : 538 - 538
  • [3] Prompting Language Models for Linguistic Structure
    Blevins, Terra
    Gonen, Hila
    Zettlemoyer, Luke
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 6649 - 6663
  • [4] Prompting Is Programming: A Query Language for Large Language Models
    Beurer-Kellner, Luca
    Fischer, Marc
    Vechev, Martin
    PROCEEDINGS OF THE ACM ON PROGRAMMING LANGUAGES-PACMPL, 2023, 7 (PLDI):
  • [5] Graph Neural Prompting with Large Language Models
    Tian, Yijun
    Song, Huan
    Wang, Zichen
    Wang, Haozhu
    Hu, Ziqing
    Wang, Fang
    Chawla, Nitesh V.
    Xu, Panpan
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 17, 2024, : 19080 - 19088
  • [6] Prompting Large Language Models With the Socratic Method
    Chang, Edward Y.
    2023 IEEE 13TH ANNUAL COMPUTING AND COMMUNICATION WORKSHOP AND CONFERENCE, CCWC, 2023, : 351 - 360
  • [7] Dehallucinating Large Language Models Using Formal Methods Guided Iterative Prompting
    Jha, Susmit
    Jha, Sumit Kumar
    Lincoln, Patrick
    Bastian, Nathaniel D.
    Velasquez, Alvaro
    Neema, Sandeep
    2023 IEEE INTERNATIONAL CONFERENCE ON ASSURED AUTONOMY, ICAA, 2023, : 149 - 152
  • [8] Counterexample Guided Inductive Synthesis Using Large Language Models and Satisfiability Solving
    Jha, Sumit Kumar
    Jha, Susmit
    Lincoln, Patrick
    Bastian, Nathaniel D.
    Velasquez, Alvaro
    Ewetz, Rickard
    Neema, Sandeep
    MILCOM 2023 - 2023 IEEE MILITARY COMMUNICATIONS CONFERENCE, 2023,
  • [9] Prompting Large Language Models to Power Educational Chatbots
    Farah, Juan Carlos
    Ingram, Sandy
    Spaenlehauer, Basile
    Lasne, Fanny Kim-Lan
    Gillet, Denis
    ADVANCES IN WEB-BASED LEARNING, ICWL 2023, 2023, 14409 : 169 - 188
  • [10] DDPrompt: Differential Diversity Prompting in Large Language Models
    Mu, Lin
    Zhang, Wenhan
    Zhang, Yiwen
    Jin, Peiquan
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2: SHORT PAPERS, 2024, : 168 - 174