Targeted Syntactic Evaluation of Language Models

Cited by: 0
Authors: Marvin, Rebecca [1]; Linzen, Tal [1]
Affiliations: [1] Johns Hopkins Univ, Dept Comp Sci, Baltimore, MD 21218 USA
DOI: none
CLC classification: TP18 [Artificial Intelligence Theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
We present a dataset for evaluating the grammaticality of the predictions of a language model. We automatically construct a large number of minimally different pairs of English sentences, each consisting of a grammatical and an ungrammatical sentence. The sentence pairs represent different variations of structure-sensitive phenomena: subject-verb agreement, reflexive anaphora, and negative polarity items. We expect a language model to assign a higher probability to the grammatical sentence than to the ungrammatical one. In an experiment using this dataset, an LSTM language model performed poorly on many of the constructions. Multi-task training with a syntactic objective (CCG supertagging) improved the LSTM's accuracy, but a large gap remained between its performance and the accuracy of human participants recruited online. This suggests that there is considerable room for improvement over LSTMs in capturing syntax in a language model.
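The evaluation protocol the abstract describes can be sketched in a few lines: score both members of each minimal pair under a language model and count a pair as passed when the grammatical member receives the higher probability. The sketch below is illustrative only; it substitutes a toy add-one-smoothed bigram model for the paper's LSTM, and the training sentences and minimal pairs are invented stand-ins for the actual dataset.

```python
import math
from collections import defaultdict

# Toy stand-in for a trained language model: a bigram model with
# add-one smoothing, estimated from a tiny "training" corpus.
# In the paper's setting this would be an LSTM LM; the evaluation
# protocol is the same regardless of the model being scored.
train = [
    "the author laughs",
    "the authors laugh",
    "the senator smiles",
    "the senators smile",
]

bigram_counts = defaultdict(int)
context_counts = defaultdict(int)
vocab = set()
for sent in train:
    toks = ["<s>"] + sent.split()
    vocab.update(toks)
    for prev, cur in zip(toks, toks[1:]):
        bigram_counts[(prev, cur)] += 1
        context_counts[prev] += 1

def sentence_logprob(sentence: str) -> float:
    """Log-probability of a sentence under the toy bigram model."""
    toks = ["<s>"] + sentence.split()
    lp = 0.0
    for prev, cur in zip(toks, toks[1:]):
        # Add-one smoothing so unseen bigrams get non-zero mass.
        lp += math.log(
            (bigram_counts[(prev, cur)] + 1)
            / (context_counts[prev] + len(vocab))
        )
    return lp

# Minimal pairs: (grammatical, ungrammatical), differing only in
# subject-verb agreement.
pairs = [
    ("the author laughs", "the author laugh"),
    ("the senators smile", "the senators smiles"),
]

# A model "passes" a pair if it assigns the grammatical member
# a higher probability; accuracy is the fraction of pairs passed.
accuracy = sum(
    sentence_logprob(good) > sentence_logprob(bad) for good, bad in pairs
) / len(pairs)
print(accuracy)
```

Because the comparison is between two sentences of equal length that differ in a single word, no length normalization is needed; the raw log-probabilities are directly comparable.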
Pages: 1192-1202 (11 pages)