Improving Chinese-Vietnamese Neural Machine Translation with Linguistic Differences

Cited by: 1
Authors
Yu, Zhiqiang [1]
Yu, Zhengtao [2]
Xian, Yantuan [2]
Huang, Yuxin [2]
Guo, Junjun [2]
Affiliations
[1] Kunming Univ Sci & Technol, Fac Informat Engn & Automat, Yunnan Minzu Univ, Yunnan Key Lab Artificial Intelligence, 727 South Jingming Rd, Kunming 650500, Yunnan, Peoples R China
[2] Kunming Univ Sci & Technol, Fac Informat Engn & Automat, Yunnan Key Lab Artificial Intelligence, Kunming, Yunnan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Neural machine translation; Chinese-Vietnamese; linguistic difference; data augmentation
DOI
10.1145/3477536
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a simple, efficient data augmentation approach for boosting Chinese-Vietnamese neural machine translation performance by leveraging the linguistic difference between the two languages. We first define a formalized representation of modifier symmetry, which is one of the most representative linguistic differences between Chinese and Vietnamese. We then propose and test two data augmentation strategies for leveraging this linguistic difference, both of which can be integrated naturally with different translation models. Results indicate that both strategies can introduce linguistic rules to boost translation accuracy. Tests on Chinese-Vietnamese benchmarks show significant accuracy improvements. To facilitate studies in this domain, we also release an open-source toolkit with a flexible implementation for Chinese-Vietnamese linguistic difference tagging.
Pages: 12