Unsupervised Alignment of Distributional Word Embeddings

Cited by: 0
Authors
Diallo, Aissatou [1]
Fuernkranz, Johannes [2]
Affiliations
[1] UCL, Dept Comp Sci, London, England
[2] Johannes Kepler Univ Linz, Computat Data Analyt, FAW, Linz, Austria
Keywords
Unsupervised alignment; Distributional embeddings; Word translation
DOI
10.1007/978-3-031-15791-2_7
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Cross-domain alignment plays a key role in tasks ranging from image-text retrieval to machine translation. The main objective is to associate related entities across different domains. Recently, purely unsupervised methods operating on monolingual embeddings have been used successfully to infer a bilingual lexicon. However, current state-of-the-art methods focus only on point vectors, although distributional embeddings have been shown to capture richer semantic information when representing words. This paper investigates a novel stochastic optimization approach for aligning distributional word embeddings. Our method builds on techniques from optimal transport to resolve the cross-domain matching problem in a principled manner. We evaluate our method on unsupervised word translation by aligning word embeddings trained on monolingual data. We present empirical evidence demonstrating the validity of our approach on the bilingual lexicon induction task across several language pairs.
Pages: 60-74
Page count: 15