Refinement of Unsupervised Cross-Lingual Word Embeddings

Cited by: 1
Authors
Biesialska, Magdalena [1]
Costa-jussa, Marta R. [1]
Affiliations
[1] Univ Politecn Cataluna, TALP Res Ctr, Barcelona, Spain
DOI
10.3233/FAIA200317
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Cross-lingual word embeddings aim to bridge the gap between high-resource and low-resource languages by making it possible to learn multilingual word representations without any direct bilingual signal. Most methods are projection-based approaches that map pre-trained embeddings into a shared latent space. These methods mostly rely on an orthogonal transformation, which assumes that the language vector spaces are isomorphic. However, this assumption does not necessarily hold, especially for morphologically rich languages. In this paper, we propose a self-supervised method to refine the alignment of unsupervised bilingual word embeddings. The proposed model moves the vectors of words and their corresponding translations closer to each other and enforces length and center invariance, which yields a better alignment of the cross-lingual embeddings. The experimental results demonstrate the effectiveness of our approach: in most cases it outperforms state-of-the-art methods on a bilingual lexicon induction task.
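For orientation, the generic projection-based baseline that the abstract refers to can be sketched as follows. This is not the refinement method proposed in the paper, only the standard orthogonal (Procrustes) mapping combined with a common length and center normalization. The function names, the toy data, and the assumption that row i of the source matrix translates to row i of the target matrix are all hypothetical and chosen for illustration.

```python
import numpy as np


def normalize(emb):
    """Scale rows to unit length, mean-center, then scale to unit length again."""
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    emb = emb - emb.mean(axis=0, keepdims=True)
    return emb / np.linalg.norm(emb, axis=1, keepdims=True)


def procrustes(src, tgt):
    """Return the orthogonal W that minimizes ||src @ W - tgt||_F.

    Assumes row i of src and row i of tgt form a translation pair.
    """
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt


# Toy data (hypothetical): 50 "words" in 4 dimensions. The target space is an
# exact rotation of the source space, i.e. the isomorphic case.
rng = np.random.default_rng(0)
src = normalize(rng.normal(size=(50, 4)))
rotation, _ = np.linalg.qr(rng.normal(size=(4, 4)))  # random orthogonal matrix
tgt = src @ rotation

W = procrustes(src, tgt)
mapped = src @ W  # source embeddings projected into the target space
```

In this idealized setting the mapping recovers the rotation exactly. With real embeddings the two spaces are only approximately isomorphic, which is the limitation the abstract points to.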
Pages: 1978-1981
Page count: 4
Related papers
  • [1] Interactive Refinement of Cross-Lingual Word Embeddings
    Yuan, Michelle
    Zhang, Mozhi
    Van Durme, Benjamin
    Findlater, Leah
    Boyd-Graber, Jordan
    [J]. PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 5984 - 5996
  • [2] Cross-Lingual Word Embeddings
    Søgaard, Anders
    Vulić, Ivan
    Ruder, Sebastian
    Faruqui, Manaal
    [J]. Synthesis Lectures on Human Language Technologies, 2019, 12 (02): : 1 - 132
  • [3] A Closer Look on Unsupervised Cross-lingual Word Embeddings Mapping
    Plucinski, Kamil
    Lango, Mateusz
    Zimniewicz, Michal
    [J]. PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 5555 - 5562
  • [4] Unsupervised cross-lingual word embeddings learning with adversarial training
    Li, Yuling
    Zhang, Yuhong
    Li, Peipei
    Hu, Xuegang
    [J]. 2019 10TH IEEE INTERNATIONAL CONFERENCE ON BIG KNOWLEDGE (ICBK 2019), 2019, : 150 - 156
  • [5] Cross-Lingual Word Embeddings
    Corro, Caio Filippo
    [J]. TRAITEMENT AUTOMATIQUE DES LANGUES, 2019, 60 (01): : 46 - 48
  • [6] Cross-Lingual Word Embeddings
    Agirre, Eneko
    [J]. COMPUTATIONAL LINGUISTICS, 2020, 46 (01) : 245 - 248
  • [7] CLUSE: Cross-Lingual Unsupervised Sense Embeddings
    Chi, Ta-Chung
    Chen, Yun-Nung
    [J]. 2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 271 - 281
  • [8] Fully unsupervised word translation from cross-lingual word embeddings especially for healthcare professionals
    Shweta Chauhan
    Shefali Saxena
    Philemon Daniel
    [J]. International Journal of System Assurance Engineering and Management, 2022, 13 : 28 - 37
  • [9] Fully unsupervised word translation from cross-lingual word embeddings especially for healthcare professionals
    Chauhan, Shweta
    Saxena, Shefali
    Daniel, Philemon
    [J]. INTERNATIONAL JOURNAL OF SYSTEM ASSURANCE ENGINEERING AND MANAGEMENT, 2022, 13 (SUPPL 1) : 28 - 37
  • [10] Cross-Lingual Word Embeddings for Turkic Languages
    Kuriyozov, Elmurod
    Doval, Yerai
    Gomez-Rodriguez, Carlos
    [J]. PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 4054 - 4062