The Effectiveness of Morphology-aware Segmentation in Low-Resource Neural Machine Translation

Authors
Saleva, Jonne [1]
Lignos, Constantine [1]
Affiliation
[1] Brandeis Univ, Waltham, MA 02453 USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper evaluates the performance of several modern subword segmentation methods in a low-resource neural machine translation setting. We compare segmentations produced by applying BPE at the token or sentence level with morphologically-based segmentations from LMVR and MORSEL. We evaluate on translation tasks between English and each of Nepali, Sinhala, and Kazakh, and predicted that morphologically-based segmentation methods would lead to better performance in this setting. However, compared to BPE, we find that no consistent and reliable differences emerge between the segmentation methods. While morphologically-based methods outperform BPE in a few cases, the best-performing method tends to vary across tasks, and the performance of the segmentation methods is often statistically indistinguishable.
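For readers unfamiliar with the BPE baseline mentioned in the abstract, the following is a minimal sketch of token-level BPE, in which merges never cross whitespace boundaries. It is an illustration only, not the authors' implementation: the function names (`learn_bpe`, `merge_pair`, `segment`), the `</w>` end-of-word marker, and the tie-breaking rule are all assumptions made for this example.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge operations from a list of sentences.

    Token-level variant (assumption for this sketch): each whitespace-delimited
    token is split into characters plus an end-of-word marker, so no merge can
    span two tokens.
    """
    vocab = Counter(
        tuple(token) + ("</w>",) for line in corpus for token in line.split()
    )
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by token frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties are broken by the pair itself so the
        # result is deterministic (a choice made for this sketch).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(token, merges):
    """Segment a single token by replaying the learned merges in order."""
    symbols = tuple(token) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    learned = learn_bpe(["low low low lower"], num_merges=2)
    print(learned)
    print(segment("lowest", learned))
```

A sentence-level variant would instead treat whitespace as an ordinary symbol, which allows merges to span token boundaries; the sketch above deliberately does not do that.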
Pages: 164-174
Page count: 11