Word-to-word Machine Translation: Bilateral Similarity Retrieval for Mitigating Hubness

Cited by: 0
Authors
Luo, Mengting [1, 2]
He, Linchao [1, 2]
Guo, Mingyue [1, 2]
Han, Fei [1, 2]
Tian, Long [1, 2]
Pu, Haibo [1, 2]
Zhang, Dejun [3]
Affiliations
[1] Sichuan Agr Univ, Lab Agr Informat Engn, Yaan 0086625014, Peoples R China
[2] Key Lab Agr Informat Engn Sichuan Prov, Yaan 0086625014, Peoples R China
[3] China Univ Geosci, Fac Informat Engn, Wuhan 0086430074, Hubei, Peoples R China
Source
2019 THE 5TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING, CONTROL AND ROBOTICS (EECR 2019) | 2019 / Vol. 533
Funding
National Natural Science Foundation of China
Keywords
DOI
10.1088/1757-899X/533/1/012051
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Nearest neighbor search plays a critical role in machine word translation because it obtains the lingual labels of source word embeddings by searching for their k nearest neighbor (k-NN) target embeddings in a shared bilingual semantic space. However, aligning two language distributions into a shared space usually requires large amounts of target labels, and k-NN retrieval causes the hubness problem in high-dimensional feature spaces. Most best-k retrieval methods mitigate hubness by removing hubs from the list of translation candidates, but eliminating hubs is flawed: a hub is also the correct translation for some source word query and should not be crudely excluded. In this paper, we introduce an unsupervised machine word translation model based on Generative Adversarial Nets (GANs) with Bilingual Similarity retrieval, namely Unsupervised-BSMWT. Our model addresses three main challenges: (1) it reduces the dependence on parallel data by using GANs in a fully unsupervised way; (2) it significantly decreases the training time of the adversarial game; and (3) it proposes a novel Bilingual Similarity retrieval that mitigates hubness pollution regardless of whether a candidate is a hub. Our model efficiently achieves competitive results in 74 min, exceeding previous GANs-based models.
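The hubness problem mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's Bilingual Similarity retrieval, which the abstract does not specify; it only shows plain cosine k-NN retrieval and a simple hub count. The toy 2-D vectors, the word pairs, and the helper names `cosine`, `knn_translate`, and `k_occurrence` are all assumptions made up for illustration.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def knn_translate(query, targets, k=1):
    """Return the k target words most cosine-similar to the query vector."""
    ranked = sorted(targets, key=lambda w: cosine(query, targets[w]), reverse=True)
    return ranked[:k]


def k_occurrence(sources, targets, k=1):
    """Count how often each target word appears in the k-NN lists of all sources."""
    counts = Counter()
    for vec in sources.values():
        counts.update(knn_translate(vec, targets, k))
    return counts


# Hypothetical, hand-made embeddings already mapped into one shared space.
sources = {"chat": [1.0, 0.1], "chien": [0.9, 0.3], "maison": [0.2, 1.0]}
targets = {"cat": [1.0, 0.0], "dog": [0.5, 0.5], "house": [0.0, 1.0]}

counts = k_occurrence(sources, targets, k=1)
for word in targets:
    print(word, counts[word])
```

In this toy setup "cat" is retrieved for both "chat" and "chien", so it behaves like a hub, while "dog" is never retrieved. Removing "cat" from the candidate list would fix "chien" but break "chat", which is the drawback of hub elimination that the abstract points out.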
Pages: 9