Analysis of Neural Machine Translation KANGRI Language by Unsupervised and Semi Supervised Methods

Cited by: 9
Authors
Chauhan, Shweta [1]
Saxena, Shefali [1]
Daniel, Philemon [1]
Affiliation
[1] Natl Inst Technol, Dept Elect & Commun, Hamirpur 177005, Himachal Pradesh, India
Keywords
Machine translation; Low resource language; Unsupervised techniques; Semi supervised techniques; Cross-lingual word embedding;
DOI
10.1080/03772063.2021.2016506
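Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Subject classification code
0808; 0809
Abstract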
Working with low-resource language pairs is very challenging because monolingual and parallel datasets either do not exist or exist only in very small amounts. Furthermore, the available written resources lack digitization. This work compares and analyzes neural machine translation systems for Kangri (ISO 639-3: xnr), a low-resource, definitely endangered language, using unsupervised and semi-supervised methods. Two systems are used: a machine translation system with a shared encoder and back-translation, applied with both unsupervised and semi-supervised learning techniques, and a language model with a denoising autoencoder that uses a fully unsupervised learning technique. Kangri is an Indo-Aryan language written in the Devanagari script, the same script as Hindi. The translation task is further complicated by the fact that Kangri is a morphologically rich language and does not have well-defined linguistic rules. To address the out-of-vocabulary problem, we used different techniques. Finally, we compare the results using different evaluation metrics, which show that semi-supervised translation with semi-supervised cross-lingual word embeddings achieves the highest score among the translation models.
Pages: 6867-6877
Page count: 11
Related papers
50 in total
  • [1] Semi-Supervised Learning for Neural Machine Translation
    Cheng, Yong
    Xu, Wei
    He, Zhongjun
    He, Wei
    Wu, Hua
    Sun, Maosong
    Liu, Yang
    [J]. PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016, : 1965 - 1974
  • [2] Reference Language based Unsupervised Neural Machine Translation
    Li, Zuchao
    Zhao, Hai
    Wang, Rui
    Utiyama, Masao
    Sumita, Eiichiro
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 4151 - 4162
  • [3] Using Neural Machine Translation Methods for Sign Language Translation
    Angelova, Galina
    Avramidis, Eleftherios
    Moeller, Sebastian
    [J]. PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022): STUDENT RESEARCH WORKSHOP, 2022, : 273 - 284
  • [4] Exploring Supervised and Unsupervised Rewards in Machine Translation
    Ive, Julia
    Wang, Zixu
    Fomicheva, Marina
    Specia, Lucia
    [J]. 16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 1908 - 1920
  • [5] Unsupervised dialectal neural machine translation
    Farhan, Wael
    Talafha, Bashar
    Abuammar, Analle
    Jaikat, Ruba
    Al-Ayyoub, Mahmoud
    Tarakji, Ahmad Bisher
    Toma, Anas
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2020, 57 (03)
  • [6] Semi-Supervised Neural Machine Translation via Marginal Distribution Estimation
    Wang, Yijun
    Xia, Yingce
    Zhao, Li
    Bian, Jiang
    Qin, Tao
    Chen, Enhong
    Liu, Tie-Yan
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (10) : 1564 - 1576
  • [7] Dual Reconstruction: a Unifying Objective for Semi-Supervised Neural Machine Translation
    Xu, Weijia
    Niu, Xing
    Carpuat, Marine
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 2006 - 2020
  • [8] English-Manipuri Machine Translation: An empirical study of different Supervised and Unsupervised Methods
    Singh, Telem Joyson
    Singh, Sanasam Ranbir
    Sarmah, Priyankoo
    [J]. 2021 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2021, : 142 - 147
  • [9] Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation
    Chronopoulou, Alexandra
    Stojanovski, Dario
    Fraser, Alexander
    [J]. 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 173 - 180
  • [10] Unsupervised Neural Machine Translation With Cross-Lingual Language Representation Agreement
    Sun, Haipeng
    Wang, Rui
    Chen, Kehai
    Utiyama, Masao
    Sumita, Eiichiro
    Zhao, Tiejun
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 1170 - 1182