Do Neural Language Models Show Preferences for Syntactic Formalisms?

Authors
Kulmizev, Artur [1]
Ravishankar, Vinit [2]
Abdou, Mostafa [3]
Nivre, Joakim [1]
Affiliations
[1] Uppsala Univ, Uppsala, Sweden
[2] Univ Oslo, Oslo, Norway
[3] Univ Copenhagen, Copenhagen, Denmark
Keywords: none listed
DOI: not available
Chinese Library Classification: TP18 [Theory of Artificial Intelligence]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
Recent work on the interpretability of deep neural language models has concluded that many properties of natural language syntax are encoded in their representational spaces. However, such studies often suffer from limited scope by focusing on a single language and a single linguistic formalism. In this study, we aim to investigate the extent to which the semblance of syntactic structure captured by language models adheres to a surface-syntactic or deep-syntactic style of analysis, and whether the patterns are consistent across different languages. We apply a probe for extracting directed dependency trees to BERT and ELMo models trained on 13 different languages, probing for two different syntactic annotation styles: Universal Dependencies (UD), which prioritizes deep syntactic relations, and Surface-Syntactic Universal Dependencies (SUD), which focuses on surface structure. We find that both models exhibit a preference for UD over SUD, with interesting variations across languages and layers, and that the strength of this preference is correlated with differences in tree shape.
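
The probing setup summarized in the abstract can be pictured with a minimal sketch: a lightweight arc-scoring probe trained on top of frozen BERT/ELMo representations, whose per-layer attachment accuracy against UD- versus SUD-annotated treebanks indicates which annotation style the representations favor. The class name DependencyProbe, the projection-based scoring, and all dimensions below are illustrative assumptions, not the authors' exact probe.

    # Minimal sketch of a head-selection probe over frozen contextual embeddings.
    # All names and dimensions here are illustrative assumptions.
    import torch
    import torch.nn as nn

    class DependencyProbe(nn.Module):
        def __init__(self, emb_dim: int, probe_dim: int = 128):
            super().__init__()
            # Light-weight projections: one for "dependent" roles, one for "head" roles.
            self.dep_proj = nn.Linear(emb_dim, probe_dim)
            self.head_proj = nn.Linear(emb_dim, probe_dim)

        def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
            # embeddings: (batch, seq_len, emb_dim), frozen outputs of a BERT/ELMo layer.
            dep = self.dep_proj(embeddings)    # (batch, seq_len, probe_dim)
            head = self.head_proj(embeddings)  # (batch, seq_len, probe_dim)
            # scores[b, i, j]: plausibility that token j is the syntactic head of token i.
            return torch.einsum("bid,bjd->bij", dep, head)

    probe = DependencyProbe(emb_dim=768)
    embs = torch.randn(2, 10, 768)         # stand-in for one layer of model output
    scores = probe(embs)                   # (2, 10, 10) arc scores
    pred_heads = scores.argmax(dim=-1)     # predicted head index for every token

Training such a probe with cross-entropy against gold head indices from UD and SUD treebanks, and comparing per-layer attachment accuracy between the two annotation styles, is the kind of comparison the abstract summarizes.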
Pages: 4077-4091 (15 pages)