Towards Analysis and Interpretation of Large Language Models for Arithmetic Reasoning

Times Cited: 0
Authors
Akter, Mst Shapna [1 ]
Shahriar, Hossain [2 ]
Cuzzocrea, Alfredo [3 ,4 ]
Affiliations
[1] Univ West Florida, Dept Intelligent Syst & Robot, Pensacola, FL 32514 USA
[2] Univ West Florida, Ctr Cybersecur, Pensacola, FL USA
[3] Univ Calabria, iDEA Lab, Arcavacata Di Rende, Italy
[4] Univ Paris City, Dept Comp Sci, Paris, France
Keywords
LLMs; Arithmetic Reasoning; Causal Mediation Analysis
DOI
10.1109/SDS60720.2024.00049
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Large Language Models (LLMs) have recently come to dominate the research landscape, with the Transformer architecture attracting particular attention in the context of arithmetic reasoning. Against this background, this paper lays the groundwork for a causal mediation analysis of how Transformer-based LLMs approach complex arithmetic problems. In particular, we investigate which internal components, such as model activations, are crucial for complex reasoning tasks. Our preliminary results indicate that, for complex arithmetic operations, information is channeled from mid-layer activations to the final token through enhanced attention mechanisms. Preliminary experiments are reported.
Pages: 267-270
Number of pages: 4
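To illustrate the kind of causal mediation analysis sketched in the abstract above, the following is a minimal, self-contained activation-patching probe. It is not the authors' code: the model (GPT-2), the patched layer, the arithmetic prompts, and the answer token are all illustrative assumptions. A mid-layer hidden state cached from a "clean" arithmetic prompt is restored into a run on a "corrupted" prompt, and the recovery of the clean answer's logit at the final token indicates how strongly that layer mediates the computation.

```python
# Minimal activation-patching sketch of causal mediation analysis
# (illustrative only; model, layer index, prompts, and answer token are assumptions).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

clean = tok("23 + 18 =", return_tensors="pt")     # clean prompt (answer: 41)
corrupt = tok("53 + 18 =", return_tensors="pt")   # corrupted prompt (answer differs)
answer_id = tok(" 41")["input_ids"][0]            # first token of the clean answer

# 1) Run the clean prompt and cache the residual-stream state after every block.
with torch.no_grad():
    clean_out = model(**clean, output_hidden_states=True)
clean_hidden = clean_out.hidden_states            # tuple: embeddings + one entry per block

layer, pos = 6, -1                                # mid layer, final token position

def patch_hook(module, inputs, output):
    # Overwrite this block's output at `pos` with the cached clean activation.
    hidden = output[0].clone()
    hidden[:, pos, :] = clean_hidden[layer + 1][:, pos, :]
    return (hidden,) + output[1:]

# 2) Run the corrupted prompt with and without the mid-layer activation restored.
handle = model.transformer.h[layer].register_forward_hook(patch_hook)
with torch.no_grad():
    patched_out = model(**corrupt)
handle.remove()

with torch.no_grad():
    corrupt_out = model(**corrupt)

def answer_logit(out):
    # Logit assigned to the clean answer token at the final position.
    return out.logits[0, -1, answer_id].item()

# 3) The indirect effect of this layer is the recovery of the clean answer's logit.
print("clean    :", clean_out.logits[0, -1, answer_id].item())
print("corrupted:", answer_logit(corrupt_out))
print("patched  :", answer_logit(patched_out))
```

Sweeping `layer` and `pos` over all blocks and token positions yields a mediation map; under the paper's stated finding, recovery would concentrate at mid-layer activations feeding the final token.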