Multi-Task Learning with Shared Encoder for Non-Autoregressive Machine Translation

Cited by: 0
Authors
Hao, Yongchang [1]
He, Shilin [2]
Jiao, Wenxiang [2]
Tu, Zhaopeng [3]
Lyu, Michael R. [2]
Wang, Xing [3]
Affiliations
[1] Soochow University, School of Computer Science and Technology, Suzhou, China
[2] The Chinese University of Hong Kong, Department of Computer Science and Engineering, Hong Kong, China
[3] Tencent AI Lab, Bellevue, WA, USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Non-Autoregressive machine Translation (NAT) models have demonstrated significant inference speedup but suffer from inferior translation accuracy. The common practice for tackling this problem is to transfer Autoregressive machine Translation (AT) knowledge to NAT models, e.g., via knowledge distillation. In this work, we hypothesize and empirically verify that the AT and NAT encoders capture different linguistic properties of source sentences. We therefore propose to adopt multi-task learning to transfer AT knowledge to NAT models through encoder sharing. Specifically, we take the AT model as an auxiliary task to enhance NAT model performance. Experimental results on the WMT14 English <-> German and WMT16 English <-> Romanian datasets show that the proposed MULTI-TASK NAT achieves significant improvements over the baseline NAT models. Furthermore, performance on the large-scale WMT19 and WMT20 English <-> German datasets confirms the consistency of our proposed method. In addition, experimental results demonstrate that our MULTI-TASK NAT is complementary to knowledge distillation, the standard knowledge transfer method for NAT.
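To make the encoder-sharing recipe concrete, here is a minimal PyTorch sketch written for this record, not taken from the paper: the class name MultiTaskNAT, the placeholder-token NAT decoder inputs, and the auxiliary loss weight 0.5 are illustrative assumptions, and real NAT systems additionally need target-length prediction, which is omitted here.

```python
# A minimal sketch (not the authors' code) of the encoder-sharing idea:
# one Transformer encoder is shared between an autoregressive (AT)
# decoder and a non-autoregressive (NAT) decoder, and the AT loss is
# used as an auxiliary training signal for the NAT model.
import torch
import torch.nn as nn


class MultiTaskNAT(nn.Module):
    def __init__(self, vocab_size, d_model=512, nhead=8, layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.encoder = nn.TransformerEncoder(           # shared by both tasks
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True), layers)
        self.at_decoder = nn.TransformerDecoder(        # auxiliary AT task
            nn.TransformerDecoderLayer(d_model, nhead, batch_first=True), layers)
        self.nat_decoder = nn.TransformerDecoder(       # main NAT task
            nn.TransformerDecoderLayer(d_model, nhead, batch_first=True), layers)
        self.proj = nn.Linear(d_model, vocab_size)

    def forward(self, src, at_tgt_in, nat_tgt_in):
        memory = self.encoder(self.embed(src))          # shared source encoding
        t = at_tgt_in.size(1)
        # Causal mask: AT decoder only attends to earlier target positions.
        causal = torch.triu(torch.ones(t, t, dtype=torch.bool), diagonal=1)
        at_h = self.at_decoder(self.embed(at_tgt_in), memory, tgt_mask=causal)
        # NAT decoder: no causal mask, all positions predicted in parallel.
        nat_h = self.nat_decoder(self.embed(nat_tgt_in), memory)
        return self.proj(at_h), self.proj(nat_h)


# Toy joint-training step; token ids and the 0.5 weight are illustrative.
vocab = 1000
model = MultiTaskNAT(vocab)
src = torch.randint(5, vocab, (2, 7))
tgt = torch.randint(5, vocab, (2, 9))
nat_in = torch.full_like(tgt, 4)                        # placeholder (e.g. <mask>) inputs
at_logits, nat_logits = model(src, tgt[:, :-1], nat_in)
ce = nn.CrossEntropyLoss()
at_loss = ce(at_logits.reshape(-1, vocab), tgt[:, 1:].reshape(-1))
nat_loss = ce(nat_logits.reshape(-1, vocab), tgt.reshape(-1))
loss = nat_loss + 0.5 * at_loss                         # AT as auxiliary task
loss.backward()
```

Because both losses backpropagate into the same encoder, the auxiliary AT gradient shapes the shared source representations, which is how the AT knowledge reaches the NAT model in this setup.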
Pages: 3989-3996
Number of pages: 8
Related Papers (50 in total; showing [41]-[50])
  • [41] Moukafih, Youness; Sbihi, Nada; Ghogho, Mounir; Smaili, Kamel. Improving Machine Translation of Arabic Dialects Through Multi-task Learning. AIxIA 2021 - Advances in Artificial Intelligence, 2022, 13196: 580-590.
  • [42] Gui, Shangtong; Shao, Chenze; Ma, Zhengrui; Zhang, Xishan; Chen, Yunji; Feng, Yang. Non-autoregressive Machine Translation with Probabilistic Context-free Grammar. Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023.
  • [43] Ran, Qiu; Lin, Yankai; Li, Peng; Zhou, Jie. Guiding Non-Autoregressive Neural Machine Translation Decoding with Reordering Information. Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI 2021), 2021, 35: 13727-13735.
  • [44] Wang, Tao; Zhao, Chengqi; Wang, Mingxuan; Li, Lei; Xiong, Deyi. Autocorrect in the Process of Translation: Multi-task Learning Improves Dialogue Machine Translation. 2021 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2021), 2021: 105-112.
  • [45] Kong, Xiang; Zhang, Zhisong; Hovy, Eduard. Incorporating a Local Translation Mechanism into Non-autoregressive Translation. arXiv, 2020.
  • [46] Huang, Fei; Tao, Tianhua; Zhou, Hao; Li, Lei; Huang, Minlie. On the Learning of Non-Autoregressive Transformers. International Conference on Machine Learning (ICML), Vol 162, 2022.
  • [47] Liu, Cheng; Cao, Wen-Ming; Zheng, Chu-Tao; Wong, Hau-San. Learning with Partially Shared Features for Multi-Task Learning. Neural Information Processing (ICONIP 2017), Pt V, 2017, 10638: 95-104.
  • [48] Kong, Xiang; Zhang, Zhisong; Hovy, Eduard. Incorporating a Local Translation Mechanism into Non-autoregressive Translation. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020: 1067-1073.
  • [49] Liu, Ye; Wan, Yao; Zhang, Jian-Guo; Zhao, Wenting; Yu, Philip S. Enriching Non-Autoregressive Transformer with Syntactic and Semantic Structures for Neural Machine Translation. 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021), 2021: 1235-1244.
  • [50] Kuga, Ryohei; Kanezaki, Asako; Samejima, Masaki; Sugano, Yusuke; Matsushita, Yasuyuki. Multi-task Learning using Multi-modal Encoder-Decoder Networks with Shared Skip Connections. 2017 IEEE International Conference on Computer Vision Workshops (ICCVW 2017), 2017: 403-411.