MultiMirror: Neural Cross-lingual Word Alignment for Multilingual Word Sense Disambiguation

Authors
Procopio, Luigi [1]
Barba, Edoardo [1]
Martelli, Federico [1]
Navigli, Roberto [1]
Affiliations
[1] Sapienza Univ Rome, Dept Comp Sci, Sapienza NLP Grp, Rome, Italy
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Word Sense Disambiguation (WSD), i.e., the task of assigning senses to words in context, has seen a surge of interest with the advent of neural models and a considerable increase in performance up to 80% F1 in English. However, when considering other languages, the availability of training data is limited, which hampers scaling WSD to many languages. To address this issue, we put forward MULTIMIRROR, a sense projection approach for multilingual WSD based on a novel neural discriminative model for word alignment: given as input a pair of parallel sentences, our model - trained with a low number of instances - is capable of jointly aligning, at the same time, all source and target tokens with each other, surpassing its competitors across several language combinations. We demonstrate that projecting senses from English by leveraging the alignments produced by our model leads a simple mBERT-powered classifier to achieve a new state of the art on established WSD datasets in French, German, Italian, Spanish and Japanese. We release our software and all our datasets at https://github.com/SapienzaNLP/multimirror.
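The sense projection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `project_senses`, the sense labels, and the rule of discarding target tokens that receive conflicting senses are all assumptions made for illustration, and the alignment pairs are written by hand rather than produced by a neural aligner.

```python
from typing import Dict, List, Set, Tuple


def project_senses(
    source_senses: Dict[int, str],
    alignments: Set[Tuple[int, int]],
    target_len: int,
) -> Dict[int, str]:
    """Copy sense labels from source tokens to the target tokens aligned to them.

    source_senses: source token index -> sense label (hypothetical labels).
    alignments:    set of (source index, target index) pairs.
    target_len:    number of tokens in the target sentence.
    """
    # Gather every sense that reaches each target token through an alignment link.
    candidates: Dict[int, List[str]] = {}
    for src_idx, tgt_idx in sorted(alignments):
        if src_idx in source_senses and 0 <= tgt_idx < target_len:
            candidates.setdefault(tgt_idx, []).append(source_senses[src_idx])

    # Assumption: keep a target token only when all incoming senses agree.
    projected: Dict[int, str] = {}
    for tgt_idx, senses in candidates.items():
        if len(set(senses)) == 1:
            projected[tgt_idx] = senses[0]
    return projected


# English: The(0) bank(1) approved(2) the(3) loan(4)
# Italian: La(0) banca(1) ha(2) approvato(3) il(4) prestito(5)
english_senses = {1: "bank.financial_institution", 4: "loan.money_lent"}
links = {(0, 0), (1, 1), (2, 3), (3, 4), (4, 5)}
italian_senses = project_senses(english_senses, links, target_len=6)
print(italian_senses)
```

In this toy example the senses annotated on "bank" and "loan" are transferred to "banca" and "prestito"; target tokens without an aligned, sense-annotated source token stay unlabeled.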
Pages: 3915-3921
Page count: 7