Morphosyntactic probing of multilingual BERT models

Cited by: 2
|
Authors
Acs, Judit [1 ,2 ]
Hamerlik, Endre [1 ,4 ]
Schwartz, Roy [5 ]
Smith, Noah A. [6 ,7 ]
Kornai, Andras [1 ,3 ]
Affiliations
[1] ELKH Inst Comp Sci & Control SZTAK, Informat Lab, Budapest, Hungary
[2] Budapest Univ Technol & Econ, Fac Elect Engn & Informat, Dept Automat & Appl Informat, Budapest, Hungary
[3] Budapest Univ Technol & Econ, Fac Nat Sci, Dept Algebra, Budapest, Hungary
[4] Comenius Univ, Fac Math Phys & Informat, Dept Appl Informat, Bratislava, Slovakia
[5] Hebrew Univ Jerusalem, Sch Comp Sci & Engn, Jerusalem, Israel
[6] Univ Washington, Paul G Allen Sch Comp Sci & Engn, Seattle, WA USA
[7] Allen Inst Artificial Intelligence, Seattle, WA USA
Keywords
Morphology; Language Resources; Multilinguality; Machine Learning; Language Models;
DOI
10.1017/S1351324923000190
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We introduce an extensive dataset for multilingual probing of morphological information in language models (247 tasks across 42 languages from 10 families), each consisting of a sentence with a target word and a morphological tag as the desired label, derived from the Universal Dependencies treebanks. We find that pre-trained Transformer models (mBERT and XLM-RoBERTa) learn features that attain strong performance across these tasks. We then apply two methods to locate, for each probing task, where the disambiguating information resides in the input. The first is a new perturbation method that "masks" various parts of context; the second is the classical method of Shapley values. The most intriguing finding that emerges is a strong tendency for the preceding context to hold more information relevant to the prediction than the following context.
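The second attribution method mentioned in the abstract, Shapley values, averages a context position's marginal contribution to the probe's score over all subsets of the other positions. A minimal self-contained sketch is below; the `toy_value` scoring function and the specific positions are hypothetical stand-ins for a real probe evaluated on masked inputs, chosen so that the preceding-context positions carry all the signal, mirroring the paper's headline finding.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values for a small set of 'players' (context positions).

    value(frozenset_of_positions) -> float is the probe's score when only
    those positions are left unmasked. Exact enumeration is exponential in
    len(players), so this is only viable for short contexts.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Standard Shapley weight for a coalition of size k.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi

# Hypothetical value function: positions 0 and 1 (preceding context) carry
# the disambiguating signal; position 2 (following context) adds nothing.
def toy_value(kept):
    return 0.5 * (0 in kept) + 0.3 * (1 in kept)

phi = shapley_values([0, 1, 2], toy_value)
# phi[0] = 0.5, phi[1] = 0.3, phi[2] = 0.0; values sum to the full score 0.8.
```

Because the toy value function is additive, each position's Shapley value equals its standalone contribution, and by the efficiency property the values sum to the score of the fully unmasked context.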
Pages: 753-792
Page count: 40
Related Papers
50 records total
  • [31] Emotion recognition in Hindi text using multilingual BERT transformer
    Kumar, Tapesh
    Mahrishi, Mehul
    Sharma, Girish
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (27) : 42373 - 42394
  • [32] What's so special about BERT's layers? A closer look at the NLP pipeline in monolingual and multilingual models
    de Vries, Wietse
    van Cranenburgh, Andreas
    Nissim, Malvina
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 4339 - 4350
  • [33] Some Morphosyntactic Features of Russian as a Heritage Language in the Speech of Bi/Multilingual Children in Canada
    Makarova, Veronika
    Terekhova, Natalia
    HERITAGE LANGUAGE JOURNAL, 2021, 18 (01): : 36 - 36
  • [34] LINSPECTOR: Multilingual Probing Tasks for Word Representations
    Sahin, Goezde Guel
    Vania, Clara
    Kuznetsov, Ilia
    Gurevych, Iryna
    COMPUTATIONAL LINGUISTICS, 2020, 46 (02) : 335 - 385
  • [35] Cross-lingual Alignment Methods for Multilingual BERT: A Comparative Study
    Kulshreshtha, Saurabh
    Redondo-Garcia, Jose Luis
    Chang, Ching-Yun
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 933 - 942
  • [36] Unsupervised Word Segmentation with BERT Oriented Probing and Transformation
    Li, Wei
    Song, Yuhan
    Su, Qi
    Shao, Yanqiu
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 3935 - 3940
  • [37] Probing BERT's priors with serial reproduction chains
    Yamakoshi, Takateru
    Griffiths, Thomas L.
    Hawkins, Robert D.
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 3977 - 3992
  • [38] Multilingual BERT-based Word Alignment By Incorporating Common Chinese Characters
    Li, Zezhong
    Sun, Xiao
    Ren, Fuji
    Ma, Jianjun
    Huang, Degen
    Shi, Piao
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22 (06)
  • [39] What does BERT know about books, movies and music? Probing BERT for Conversational Recommendation
    Penha, Gustavo
    Hauff, Claudia
    RECSYS 2020: 14TH ACM CONFERENCE ON RECOMMENDER SYSTEMS, 2020, : 388 - 397
  • [40] Deep Subjecthood: Higher-Order Grammatical Features in Multilingual BERT
    Papadimitriou, Isabel
    Chi, Ethan A.
    Futrell, Richard
    Mahowald, Kyle
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 2522 - 2532