Explaining pretrained language models' understanding of linguistic structures using construction grammar

Cited by: 0
Authors
Weissweiler, Leonie [1, 2]
Hofmann, Valentin [1, 3]
Köksal, Abdullatif [1, 2]
Schütze, Hinrich [1, 2]
Affiliations
[1] Ludwig Maximilians Univ München, Ctr Informat & Language Proc, Munich, Germany
[2] Munich Ctr Machine Learning, Munich, Germany
[3] Univ Oxford, Fac Linguist, Oxford, England
Source
Funding
European Research Council
Keywords
NLP; probing; construction grammar; computational linguistics; large language models; COMPARATIVE CORRELATIVES;
DOI
10.3389/frai.2023.1225791
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Construction Grammar (CxG) is a paradigm from cognitive linguistics that emphasizes the connection between syntax and semantics. Rather than rules operating on lexical items, it posits constructions as the central building blocks of language, i.e., linguistic units of different granularity that combine syntax and semantics. As a first step toward assessing the compatibility of CxG with the syntactic and semantic knowledge demonstrated by state-of-the-art pretrained language models (PLMs), we present an investigation of their capability to classify and understand one of the most commonly studied constructions, the English comparative correlative (CC). We conduct experiments examining the classification accuracy of a syntactic probe on the one hand and the models' behavior in a semantic application task on the other, with BERT, RoBERTa, and DeBERTa as the example PLMs. Our results show that all three investigated PLMs, as well as OPT, are able to recognize the structure of the CC but fail to use its meaning. While PLMs have been claimed to achieve human-like performance on many NLP tasks, this indicates that they still suffer from substantial shortcomings in central domains of linguistic knowledge.
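The syntactic probe described in the abstract can be pictured as a lightweight classifier trained on frozen sentence embeddings to predict whether a sentence instantiates the comparative correlative ("the Xer ..., the Yer ..."). The following is a minimal sketch of that probing setup, not the paper's actual code: random vectors stand in for real PLM embeddings (e.g., BERT [CLS] vectors from a frozen encoder), and a plain logistic-regression probe is trained by gradient descent.

```python
import numpy as np

# Linear probe sketch: logistic regression over frozen sentence embeddings,
# classifying "contains comparative correlative" vs. "does not".
# NOTE: random vectors below are stand-ins for real PLM embeddings.

rng = np.random.default_rng(0)
DIM = 32

# Toy dataset: two classes of "embeddings" separated along a random direction.
direction = rng.normal(size=DIM)
pos = rng.normal(size=(100, DIM)) + direction   # sentences with the CC
neg = rng.normal(size=(100, DIM)) - direction   # sentences without it
X = np.vstack([pos, neg])
y = np.array([1] * 100 + [0] * 100)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Train the probe with gradient descent on the logistic loss;
# only the probe's parameters (w, b) are learned, the embeddings stay fixed.
w = np.zeros(DIM)
b = 0.0
lr = 0.1
for _ in range(200):
    p = sigmoid(X @ w + b)
    w -= lr * (X.T @ (p - y)) / len(y)
    b -= lr * np.mean(p - y)

accuracy = np.mean((sigmoid(X @ w + b) > 0.5) == y)
print(f"probe training accuracy: {accuracy:.2f}")
```

High probe accuracy is taken as evidence that the frozen representations encode the structural distinction; the paper's semantic application task then tests separately whether the models can also exploit the construction's meaning.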
Pages: 16