Over-Reasoning and Redundant Calculation of Large Language Models

Cited by: 0
Authors
Chiang, Cheng-Han [1]
Lee, Hung-yi [1]
Affiliations
[1] Natl Taiwan Univ, Taipei, Taiwan
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Large language models (LLMs) can solve problems step by step. While this chain-of-thought (CoT) reasoning boosts LLMs' performance, it is unclear whether LLMs know when to use CoT and whether that reasoning is always necessary to answer the question. This paper shows that LLMs tend to generate redundant calculations and reasoning on a manually constructed math QA dataset, GSM8K-Zero. GSM8K-Zero is constructed such that the questions can be answered without any calculations, but LLMs, including Llama-2 models and Claude-2, tend to generate lengthy and unnecessary calculations to answer them. We also conduct experiments to explain why LLMs generate redundant calculations and reasoning. GSM8K-Zero is publicly available at https://github.com/d223302/Over-Reasoning-of-LLMs and https://huggingface.co/datasets/dcml0714/GSM8K-Zero.
Pages: 161-169
Number of pages: 9
Related papers
50 in total
  • [31] Towards Benchmarking and Improving the Temporal Reasoning Capability of Large Language Models
    Tan, Qingyu
    Ng, Hwee Tou
    Bing, Lidong
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 14820 - 14835
  • [32] Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
    Wei, Jason
    Wang, Xuezhi
    Schuurmans, Dale
    Bosma, Maarten
    Ichter, Brian
    Xia, Fei
    Chi, Ed H.
    Le, Quoc V.
    Zhou, Denny
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022
  • [33] An Evaluation of Reasoning Capabilities of Large Language Models in Financial Sentiment Analysis
    Du, Kelvin
    Xing, Frank
    Mao, Rui
    Cambria, Erik
    2024 IEEE CONFERENCE ON ARTIFICIAL INTELLIGENCE, CAI 2024, 2024, : 189 - 194
  • [34] Large Language Models lack essential metacognition for reliable medical reasoning
    Griot, Maxime
    Hemptinne, Coralie
    Vanderdonckt, Jean
    Yuksel, Demet
    NATURE COMMUNICATIONS, 2025, 16 (01)
  • [35] TIMEBENCH: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models
    Chu, Zheng
    Chen, Jingchang
    Chen, Qianglong
    Yu, Weijiang
    Wang, Haotian
    Liu, Ming
    Qin, Bing
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 1204 - 1228
  • [36] Reasoning in Large Language Models Through Symbolic Math Word Problems
    Gaur, Vedant
    Saunshi, Nikunj
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 5889 - 5903
  • [37] VISA: Reasoning Video Object Segmentation via Large Language Models
    Yan, Cilin
    Wang, Haochen
    Yan, Shilin
    Jiang, Xiaolong
    Hu, Yao
    Kang, Guoliang
    Xie, Weidi
    Gavves, Efstratios
    COMPUTER VISION - ECCV 2024, PT XV, 2025, 15073 : 98 - 115
  • [38] The Magic of IF: Investigating Causal Reasoning Abilities in Large Language Models of Code
    Liu, Xiao
    Yin, Da
    Zhang, Chen
    Feng, Yansong
    Zhao, Dongyan
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 9009 - 9022
  • [39] ViCor: Bridging Visual Understanding and Commonsense Reasoning with Large Language Models
    Zhou, Kaiwen
    Lee, Kwonjoon
    Misu, Teruhisa
    Wang, Xin Eric
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 10783 - 10795
  • [40] Assessing the Emergent Symbolic Reasoning Abilities of Llama Large Language Models
    Petruzzellis, Flavio
    Testolin, Alberto
    Sperduti, Alessandro
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING-ICANN 2024, PT V, 2024, 15020 : 266 - 276