Exploring Equation as a Better Intermediate Meaning Representation for Numerical Reasoning of Large Language Models

Cited by: 0
Authors
Wang, Dingzirui [1 ]
Dou, Longxu [1 ]
Zhang, Wenbin [2 ]
Zeng, Junyu [2 ]
Che, Wanxiang [1 ]
Affiliations
[1] Harbin Inst Technol, Harbin, Peoples R China
[2] Yunfu Technol Beijing Co Ltd, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Numerical reasoning is a vital capability for natural language processing models to understand and process numerical information in real-world scenarios. Most current methods first generate Intermediate Meaning Representations (IMRs) of questions and then generate answers. Current state-of-the-art methods generate programs as IMRs with large language models (LLMs). Intuitively, equations have fewer restrictions and semantics closer to the question than programs, which should lead to higher generation accuracy. However, current LLMs generate equations less accurately than programs, which we attribute to equation data being rarer than program data in pre-training corpora. In this paper, we therefore use equations as IMRs for the numerical reasoning task by addressing two problems: (1) theoretically, how to prove that equations are an IMR with higher generation accuracy than programs; (2) empirically, how to improve the accuracy of equation generation with LLMs. For the first problem, we propose and prove a proposition for theoretically comparing the generation accuracy of different IMRs. For the second problem, we present Boosting Numerical Reasoning by Decomposing the Generation of Equations (BRIDGE), a method that improves the accuracy of LLMs in generating equations as IMRs by reducing their tendency to generate constant expressions and programs. Our method improves performance by 2.2%, 0.9%, and 1.7% on the GSM8K, SVAMP, and Algebra datasets compared to the previous state-of-the-art methods under the single reasoning path setting. Our code and prompts are available at https://github.com/zirui-HIT/Bridge_for_Numerical_Reasoning.
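To make the program-vs-equation contrast from the abstract concrete, the following is a minimal, hypothetical Python sketch (not the authors' released BRIDGE code). For an illustrative question such as "Twice a number plus 3 equals 11; what is the number?", a program IMR must already be rearranged into executable assignment form, whereas an equation IMR restates the relation as worded in the question and delegates the rearrangement to a symbolic solver (sympy here); the question text, the variable name x, and the function names are assumptions made only for illustration.

# Minimal illustrative sketch; not the authors' BRIDGE implementation.
# Hypothetical question: "Twice a number plus 3 equals 11. What is the number?"
from sympy import Eq, Symbol, solve

def program_imr() -> float:
    # Program-style IMR: the relation must already be rearranged into an
    # executable assignment before it is emitted.
    answer = (11 - 3) / 2
    return answer

def equation_imr() -> float:
    # Equation-style IMR: restate the relation exactly as the question words it
    # and let a symbolic solver do the rearranging.
    x = Symbol("x")
    solutions = solve(Eq(2 * x + 3, 11), x)  # -> [4]
    return float(solutions[0])

if __name__ == "__main__":
    print(program_imr())   # 4.0
    print(equation_imr())  # 4.0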
Pages: 19116-19125
Page count: 10