Analysis of Neural Machine Translation KANGRI Language by Unsupervised and Semi Supervised Methods

Cited by: 9

Authors:

Chauhan, Shweta [1]

Saxena, Shefali [1]

Daniel, Philemon [1]

Affiliations:

[1] Natl Inst Technol, Dept Elect & Commun, Hamirpur 177005, Himachal Pradesh, India

Source:

IETE JOURNAL OF RESEARCH | 2023 / Vol. 69 / No. 10

Keywords:

Machine translation; Low resource language; Unsupervised techniques; Semi supervised techniques; Cross-lingual word embedding

DOI:

10.1080/03772063.2021.2016506

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Subject classification codes:

0808; 0809

Abstract:

Working with low-resource language pairs is very challenging because monolingual and parallel datasets either do not exist or exist only in very small amounts. Furthermore, the available written resources have largely not been digitized. This work provides a comparison and analysis of neural machine translation systems for Kangri (ISO 639-3: xnr), a low-resource, definitely endangered language, using unsupervised and semi-supervised methods. Two systems are used: a shared-encoder machine translation system with back-translation, trained with both unsupervised and semi-supervised learning techniques, and a language model with a denoising autoencoder, trained with a fully unsupervised learning technique. Kangri, an Indo-Aryan language, is written in the Devanagari script, the same as Hindi. The translation task is further complicated by the fact that Kangri is a morphologically rich language without well-defined linguistic rules. To address the out-of-vocabulary problem, we have used different techniques. Finally, we compare the results using different evaluation metrics, which show that semi-supervised translation with semi-supervised cross-lingual word embeddings achieves the highest score among the translation models compared.


Pages: 6867-6877

Page count: 11
