Improving neural machine translation for low-resource Indian languages using rule-based feature extraction

Authors
Muskaan Singh
Ravinder Kumar
Inderveer Chana
Affiliations
[1] Thapar Institute of Engineering and Technology (TIET), Language Engineering and Machine Learning Research Labs, CSED
[2] CSED
[3] Thapar Institute of Engineering and Technology (TIET)
Source
NEURAL COMPUTING & APPLICATIONS, 2021, 33 (04)
Keywords
Recurrent neural network; Linguistic feature extraction; Deep learning; Rule-based system; Sanskrit–Hindi translation
DOI
Not available
Abstract
Languages help to unite the world socially, culturally and technologically. Because people communicate in different languages, there is a strong need for translation between languages to transfer and share information and ideas. Although Sanskrit is an ancient Indo-European language, a significant amount of work on processing it is still required to explore its full potential and to open new vistas in computational linguistics and computer science. In this paper, we propose and present a machine translation system for translating Sanskrit to Hindi. The technique uses linguistic features obtained from a rule-based system to train a neural machine translation system. The approach is novel and applicable to any low-resource language with rich morphology. It is a generic system covering various domains with minimal human intervention. Performance is analysed using automatic and linguistic measures. The results show that the proposed approach outperforms earlier work for this language pair.
Pages: 1103 - 1122
Page count: 19
Related papers
  • [1] Improving neural machine translation for low-resource Indian languages using rule-based feature extraction
    Singh, Muskaan
    Kumar, Ravinder
    Chana, Inderveer
    [J]. NEURAL COMPUTING & APPLICATIONS, 2021, 33 (04): 1103 - 1122
  • [2] Improving Neural Machine Translation Using Rule-Based Machine Translation
    Singh, Muskaan
    Kumar, Ravinder
    Chana, Inderveer
    [J]. 2019 7TH INTERNATIONAL CONFERENCE ON SMART COMPUTING & COMMUNICATIONS (ICSCC), 2019: 8 - 12
  • [3] Neural Machine Translation for Low-resource Languages: A Survey
    Ranathunga, Surangika
    Lee, En-Shiun Annie
    Skenduli, Marjana Prifti
    Shekhar, Ravi
    Alam, Mehreen
    Kaur, Rishemjit
    [J]. ACM COMPUTING SURVEYS, 2023, 55 (11)
  • [4] Machine Translation in Low-Resource Languages by an Adversarial Neural Network
    Sun, Mengtao
    Wang, Hao
    Pasquine, Mark
    Hameed, Ibrahim A.
    [J]. APPLIED SCIENCES-BASEL, 2021, 11 (22)
  • [5] Extremely low-resource neural machine translation for Asian languages
    Rubino, Raphael
    Marie, Benjamin
    Dabre, Raj
    Fujita, Atsushi
    Utiyama, Masao
    Sumita, Eiichiro
    [J]. MACHINE TRANSLATION, 2020, 34 (04): 347 - 382
  • [6] Neural Machine Translation of Low-Resource and Similar Languages with Backtranslation
    Przystupa, Michael
    Abdul-Mageed, Muhammad
    [J]. FOURTH CONFERENCE ON MACHINE TRANSLATION (WMT 2019), VOL 3: SHARED TASK PAPERS, DAY 2, 2019: 224 - 235
  • [7] Recent advances in Apertium, a free/open-source rule-based machine translation platform for low-resource languages
    Khanna, Tanmai
    Washington, Jonathan N.
    Tyers, Francis M.
    Bayatli, Sevilay
    Swanson, Daniel G.
    Pirinen, Tommi A.
    Tang, Irene
    Font, Hector Alos i
    [J]. MACHINE TRANSLATION, 2021, 35 (04): 475 - 502
  • [8] Neighbors helping the poor: improving low-resource machine translation using related languages
    Pourdamghani, Nima
    Knight, Kevin
    [J]. MACHINE TRANSLATION, 2019, 33 (03): 239 - 258
  • [9] Benchmarking Neural and Statistical Machine Translation on Low-Resource African Languages
    Duh, Kevin
    McNamee, Paul
    Post, Matt
    Thompson, Brian
    [J]. PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020: 2667 - 2675
  • [10] Towards a Low-Resource Neural Machine Translation for Indigenous Languages in Canada
    Ngoc Tan Le
    Sadat, Fatiha
    [J]. TRAITEMENT AUTOMATIQUE DES LANGUES, 2021, 62 (03): 39 - 63