The Effectiveness of Morphology-aware Segmentation in Low-Resource Neural Machine Translation

被引:0
|
作者
Saleva, Jonne [1 ]
Lignos, Constantine [1 ]
机构
[1] Brandeis Univ, Waltham, MA 02453 USA
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
This paper evaluates the performance of several modern subword segmentation methods in a low-resource neural machine translation setting. We compare segmentations produced by applying BPE at the token or sentence level with morphologically-based segmentations from LMVR and MORSEL. We evaluate translation tasks between English and each of Nepali, Sinhala, and Kazakh, and predict that using morphologically-based segmentation methods would lead to better performance in this setting. However, comparing to BPE, we find that no consistent and reliable differences emerge between the segmentation methods. While morphologically-based methods outperform BPE in a few cases, what performs best tends to vary across tasks, and the performance of segmentation methods is often statistically indistinguishable.
引用
收藏
页码:164 / 174
页数:11
相关论文
共 50 条
  • [1] Morphology-Aware Word-Segmentation in Dialectal Arabic Adaptation of Neural Machine Translation
    Tawfik, Ahmed Y.
    Emam, Mahitab
    Essam, Khaled
    Nabil, Robert
    Hassan, Hany
    [J]. FOURTH ARABIC NATURAL LANGUAGE PROCESSING WORKSHOP (WANLP 2019), 2019, : 11 - 17
  • [2] A Survey on Low-Resource Neural Machine Translation
    Wang, Rui
    Tan, Xu
    Luo, Renqian
    Qin, Tao
    Liu, Tie-Yan
    [J]. PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 4636 - 4643
  • [3] A Survey on Low-resource Neural Machine Translation
    Li, Hong-Zheng
    Feng, Chong
    Huang, He-Yan
    [J]. Zidonghua Xuebao/Acta Automatica Sinica, 2021, 47 (06): : 1217 - 1231
  • [4] Transformers for Low-resource Neural Machine Translation
    Gezmu, Andargachew Mekonnen
    Nuernberger, Andreas
    [J]. ICAART: PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE - VOL 1, 2022, : 459 - 466
  • [5] Low-Resource Neural Machine Translation with Neural Episodic Control
    Wu, Nier
    Hou, Hongxu
    Sun, Shuo
    Zheng, Wei
    [J]. 2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [6] Low-resource Neural Machine Translation: Methods and Trends
    Shi, Shumin
    Wu, Xing
    Su, Rihai
    Huang, Heyan
    [J]. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2022, 21 (05)
  • [7] Recent advances of low-resource neural machine translation
    Haque, Rejwanul
    Liu, Chao-Hong
    Way, Andy
    [J]. MACHINE TRANSLATION, 2021, 35 (04) : 451 - 474
  • [8] Neural Machine Translation for Low-resource Languages: A Survey
    Ranathunga, Surangika
    Lee, En-Shiun Annie
    Skenduli, Marjana Prifti
    Shekhar, Ravi
    Alam, Mehreen
    Kaur, Rishemjit
    [J]. ACM COMPUTING SURVEYS, 2023, 55 (11)
  • [9] Data Augmentation for Low-Resource Neural Machine Translation
    Fadaee, Marzieh
    Bisazza, Arianna
    Monz, Christof
    [J]. PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 2, 2017, : 567 - 573
  • [10] A Strategy for Referential Problem in Low-Resource Neural Machine Translation
    Ji, Yatu
    Shi, Lei
    Su, Yila
    Ren, Qing-dao-er-ji
    Wu, Nier
    Wang, Hongbin
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2021, PT V, 2021, 12895 : 321 - 332