Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models

Cited by: 0
Authors
Finlayson, Matthew [1]
Mueller, Aaron [2]
Gehrmann, Sebastian [3]
Shieber, Stuart [1]
Linzen, Tal [4]
Belinkov, Yonatan [5]
Affiliations
[1] Harvard Univ, Cambridge, MA 02138 USA
[2] Johns Hopkins Univ, Baltimore, MD USA
[3] Google Res, New York, NY USA
[4] NYU, New York, NY USA
[5] Technion IIT, Haifa, Israel
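Funding
Israel Science Foundation; US National Science Foundation
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract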
Targeted syntactic evaluations have demonstrated the ability of language models to perform subject-verb agreement in difficult contexts. To elucidate the mechanisms by which the models accomplish this behavior, this study applies causal mediation analysis to pre-trained neural language models. We investigate the magnitude of models' preferences for grammatical inflections, as well as whether neurons process subject-verb agreement similarly across sentences with different syntactic structures. We uncover similarities and differences across architectures and model sizes; notably, larger models do not necessarily learn stronger preferences. We also observe two distinct mechanisms for producing subject-verb agreement depending on the syntactic structure of the input sentence. Finally, we find that language models rely on similar sets of neurons when given sentences with similar syntactic structure.
Pages: 1828-1843
Page count: 16