A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis

Cited by: 0
Authors
Stolfo, Alessandro [1]
Belinkov, Yonatan [2]
Sachan, Mrinmaya [1]
Affiliations
[1] Swiss Fed Inst Technol, Zurich, Switzerland
[2] Technion IIT, Haifa, Israel
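Funding
Israel Science Foundation; Swiss National Science Foundation
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract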
Mathematical reasoning in large language models (LMs) has garnered significant attention in recent work, but there is a limited understanding of how these models process and store information related to arithmetic tasks within their architecture. In order to improve our understanding of this aspect of language models, we present a mechanistic interpretation of Transformer-based LMs on arithmetic questions using a causal mediation analysis framework. By intervening on the activations of specific model components and measuring the resulting changes in predicted probabilities, we identify the subset of parameters responsible for specific predictions. This provides insights into how information related to arithmetic is processed by LMs. Our experimental results indicate that LMs process the input by transmitting the information relevant to the query from mid-sequence early layers to the final token using the attention mechanism. Then, this information is processed by a set of MLP modules, which generate result-related information that is incorporated into the residual stream. To assess the specificity of the observed activation dynamics, we compare the effects of different model components on arithmetic queries with other tasks, including number retrieval from prompts and factual knowledge questions.
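The intervention described in the abstract (replace the activation of one component, then measure how the predicted probabilities change) can be illustrated on a toy network. The following is a minimal sketch, not the authors' implementation: the two-layer NumPy model, the random weights, and the names `forward` and `indirect_effects` are all assumptions made for illustration, and the effect is measured as a plain difference in the probability of one output class rather than with the metric used in the paper.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def forward(x, W1, W2, patch=None):
    """Toy two-layer network (hypothetical stand-in for an LM).

    `patch` maps a hidden-unit index to a value that overrides
    that unit's activation before the output layer is applied.
    """
    h = np.tanh(W1 @ x)
    if patch:
        for idx, value in patch.items():
            h[idx] = value
    return softmax(W2 @ h), h

def indirect_effects(x_clean, x_corrupt, W1, W2, target):
    """Patch each hidden unit's clean activation into the corrupted run
    and record the change in the probability of class `target`."""
    _, h_clean = forward(x_clean, W1, W2)
    p_corrupt, _ = forward(x_corrupt, W1, W2)
    effects = np.zeros(len(h_clean))
    for i in range(len(h_clean)):
        p_patched, _ = forward(x_corrupt, W1, W2, patch={i: h_clean[i]})
        effects[i] = p_patched[target] - p_corrupt[target]
    return effects

# Illustrative setup: random weights and inputs, fixed seed for repeatability.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 4))
W2 = rng.normal(size=(3, 8))
x_clean = rng.normal(size=4)
x_corrupt = rng.normal(size=4)

p_clean, _ = forward(x_clean, W1, W2)
target = int(p_clean.argmax())  # class the clean run predicts
effects = indirect_effects(x_clean, x_corrupt, W1, W2, target)
print(effects)
```

Units with large positive entries in `effects` are those whose clean activation, on its own, moves the corrupted run furthest toward the clean prediction. In a real Transformer the same idea would be applied to attention or MLP outputs at specific layers and token positions rather than to single hidden units.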
Pages: 7035 - 7052
Page count: 18
Related papers
50 in total
  • [41] Benchmarking Large Language Models for Log Analysis, Security, and Interpretation
    Karlsen, Egil
    Luo, Xiao
    Zincir-Heywood, Nur
    Heywood, Malcolm
    JOURNAL OF NETWORK AND SYSTEMS MANAGEMENT, 2024, 32 (03)
  • [42] Counterfactual reasoning in space and time: Integrating graphical causal models in computational movement analysis
    Rahimi, Saeed
    Moore, Antoni B.
    Whigham, Peter A.
    Dillingham, Peter
    TRANSACTIONS IN GIS, 2023, 27 (07) : 1846 - 1864
  • [43] Longitudinal Mediation Analysis Using Natural Effect Models
    Mittinty, Murthy N.
    Vansteelandt, Stijn
    AMERICAN JOURNAL OF EPIDEMIOLOGY, 2020, 189 (11) : 1427 - 1435
  • [44] Variable selection for causal mediation analysis using LASSO-based methods
    Ye, Zhaoxin
    Zhu, Yeying
    Coffman, Donna L.
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2021, 30 (06) : 1413 - 1427
  • [45] Disentangling the far transfer of language comprehension gains using latent mediation models
    Melby-Lervag, Monica
    Hagen, Aste Mjelve
    Lervag, Arne
    DEVELOPMENTAL SCIENCE, 2020, 23 (04)
  • [47] High-dimensional causal mediation analysis based on partial linear structural equation models
    Cai, Xizhen
    Zhu, Yeying
    Huang, Yuan
    Ghosh, Debashis
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2022, 174
  • [48] Exploring the Numerical Reasoning Capabilities of Language Models: A Comprehensive Analysis on Tabular Data
    Akhtar, Mubashara
    Shankarampeta, Abhilash
    Gupta, Vivek
    Patil, Arpit
    Cocarascu, Oana
    Simperl, Elena
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023 : 15391 - 15405
  • [49] Performance evaluation of large language models with chain-of-thought reasoning ability in clinical laboratory case interpretation
    Yang, He S.
    Li, Jieli
    Yi, Xin
    Wang, Fei
    CLINICAL CHEMISTRY AND LABORATORY MEDICINE, 2025
  • [50] Mental Models and Computer-Based Scientific Inquiry Learning: Effects of Mechanistic Cues on Adolescent Representation and Reasoning About Causal Systems
    Kaplan, Danielle E.
    Black, John B.
    JOURNAL OF SCIENCE EDUCATION AND TECHNOLOGY, 2003, 12 (04) : 483 - 493