Morphosyntactic probing of multilingual BERT models

Cited by: 2
Authors
Acs, Judit [1 ,2 ]
Hamerlik, Endre [1 ,4 ]
Schwartz, Roy [5 ]
Smith, Noah A. [6 ,7 ]
Kornai, Andras [1 ,3 ]
Affiliations
[1] ELKH Inst Comp Sci & Control (SZTAKI), Informat Lab, Budapest, Hungary
[2] Budapest Univ Technol & Econ, Fac Elect Engn & Informat, Dept Automat & Appl Informat, Budapest, Hungary
[3] Budapest Univ Technol & Econ, Fac Nat Sci, Dept Algebra, Budapest, Hungary
[4] Comenius Univ, Fac Math Phys & Informat, Dept Appl Informat, Bratislava, Slovakia
[5] Hebrew Univ Jerusalem, Sch Comp Sci & Engn, Jerusalem, Israel
[6] Univ Washington, Paul G Allen Sch Comp Sci & Engn, Seattle, WA USA
[7] Allen Inst Artificial Intelligence, Seattle, WA USA
Keywords
Morphology; Language Resources; Multilinguality; Machine Learning; Language Models
DOI
10.1017/S1351324923000190
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We introduce an extensive dataset for multilingual probing of morphological information in language models (247 tasks across 42 languages from 10 families), each task consisting of a sentence with a target word and a morphological tag as the desired label, derived from the Universal Dependencies treebanks. We find that pre-trained Transformer models (mBERT and XLM-RoBERTa) learn features that attain strong performance across these tasks. We then apply two methods to locate, for each probing task, where the disambiguating information resides in the input. The first is a new perturbation method that "masks" various parts of the context; the second is the classical method of Shapley values. The most intriguing finding is a strong tendency for the preceding context to hold more information relevant to the prediction than the following context.
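To make the abstract's probing setup concrete, here is a minimal sketch under stated assumptions: frozen multilingual BERT features for a target word feed a small diagnostic classifier, and a crude [MASK]-based helper stands in for the paper's context-perturbation idea. The toy German sentences, the Sing/Plur label set, and the helper names (target_vector, mask_context) are illustrative inventions, not the authors' dataset or code.

```python
# Minimal probing sketch: frozen mBERT features + a diagnostic classifier.
# The data, labels, and masking scheme below are assumptions for illustration.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def target_vector(sentence: str, target: str) -> torch.Tensor:
    """Mean-pool the subword vectors of the target word's first occurrence."""
    enc = tokenizer(sentence, return_tensors="pt", return_offsets_mapping=True)
    offsets = enc.pop("offset_mapping")[0].tolist()  # char span per subword
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]   # (seq_len, hidden_dim)
    start = sentence.index(target)
    end = start + len(target)
    # Keep subwords whose character span overlaps the target word's span
    # (special tokens like [CLS] have the empty span (0, 0) and drop out).
    idx = [i for i, (s, e) in enumerate(offsets)
           if e > s and s < end and e > start]
    return hidden[idx].mean(dim=0)

def mask_context(sentence: str, target: str, side: str) -> str:
    """Replace all words on one side of the target with [MASK]; a crude
    stand-in for the paper's context-perturbation method (an assumption)."""
    start = sentence.index(target)
    end = start + len(target)
    if side == "left":
        masks = ["[MASK]"] * len(sentence[:start].split())
        return " ".join(masks + [sentence[start:]])
    masks = ["[MASK]"] * len(sentence[end:].split())
    return " ".join([sentence[:end]] + masks)

# Toy German number-probing task: predict Sing/Plur for the target noun.
train = [("Der Hund schläft tief.", "Hund", "Sing"),
         ("Die Hunde schlafen tief.", "Hunde", "Plur"),
         ("Die Katze frisst schnell.", "Katze", "Sing"),
         ("Die Katzen fressen schnell.", "Katzen", "Plur")]
X = torch.stack([target_vector(s, w) for s, w, _ in train]).numpy()
y = [label for _, _, label in train]
probe = LogisticRegression(max_iter=1000).fit(X, y)

# Probe a held-out sentence, then again with its left context masked,
# to see whether the preceding context carried the disambiguating signal.
test = ("Die Vögel singen laut.", "Vögel")
print(probe.predict([target_vector(*test).numpy()]))
print(probe.predict([target_vector(mask_context(*test, "left"), test[1]).numpy()]))
```

Shapley values, the second attribution method the abstract mentions, would instead average each token's marginal contribution over many masked token subsets; that machinery is omitted here for brevity.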
Pages: 753-792
Page count: 40
Related papers (50 in total; first 10 shown)
  • [1] FinEst BERT and CroSloEngual BERT: Less Is More in Multilingual Models
    Ulcar, Matej
    Robnik-Sikonja, Marko
    TEXT, SPEECH, AND DIALOGUE (TSD 2020), 2020, 12284 : 104 - 111
  • [2] How multilingual is Multilingual BERT?
    Pires, Telmo
    Schlinger, Eva
    Garrette, Dan
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 4996 - 5001
  • [3] Multilingual BERT has an accent: Evaluating English influences on fluency in multilingual models
    Papadimitriou, Isabel
    Lopez, Kezia
    Jurafsky, Dan
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 1194 - 1200
  • [4] Probing Multilingual Language Models for Discourse
    Kurfali, Murathan
    Ostling, Robert
    REPL4NLP 2021: PROCEEDINGS OF THE 6TH WORKSHOP ON REPRESENTATION LEARNING FOR NLP, 2021, : 8 - 19
  • [5] Probing Multilingual Cognate Prediction Models
    Fourrier, Clementine
    Sagot, Benoit
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 3786 - 3801
  • [6] A multilabel approach to morphosyntactic probing
    Shapiro, Naomi Tachikawa
    Paullada, Amandalynne
    Steinert-Threlkeld, Shane
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 4486 - 4524
  • [7] Evaluating Multilingual BERT for Estonian
    Kittask, Claudia
    Milintsevich, Kirill
    Sirts, Kairit
    HUMAN LANGUAGE TECHNOLOGIES - THE BALTIC PERSPECTIVE (HLT 2020), 2020, 328 : 19 - 26
  • [8] BERT Probe: A Python package for probing attention-based robustness evaluation of BERT models
    Khan, Shahrukh
    Shahid, Mahnoor
    Singh, Navdeeppal
    SOFTWARE IMPACTS, 2022, 13
  • [9] Exploring Multilingual Word Embedding Alignments in BERT Models: A Case Study of English and Norwegian
    Aaby, Pernille
    Biermann, Daniel
    Yazidi, Anis
    Mello, Gustavo Borges Moreno e
    Palumbo, Fabrizio
    ARTIFICIAL INTELLIGENCE XL, AI 2023, 2023, 14381 : 47 - 58
  • [10] Probing BERT for Ranking Abilities
    Wallat, Jonas
    Beringer, Fabian
    Anand, Abhijit
    Anand, Avishek
    ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT II, 2023, 13981 : 255 - 273