Training Flexible Depth Model by Multi-Task Learning for Neural Machine Translation

Cited by: 0
Authors
Wang, Qiang [1 ]
Xiao, Tong [2 ,3 ]
Zhu, Jingbo [2 ,3 ]
Affiliations
[1] Alibaba DAMO Acad, Machine Intelligence Technol Lab, Hangzhou, Zhejiang, Peoples R China
[2] Northeastern Univ, Shenyang, Peoples R China
[3] NiuTrans Co Ltd, Shenyang, Peoples R China
Funding
U.S. National Science Foundation; National Key Research and Development Program;
Keywords
DOI
Not available
CLC number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
The standard neural machine translation model can only decode with the same depth configuration used in training. This restriction forces us to deploy models of various sizes to keep translation latency consistent, because hardware conditions on different terminal devices (e.g., mobile phones) may vary greatly. Training each model individually increases model maintenance costs and slows model iteration, especially in industry settings. In this work, we propose to use multi-task learning to train a flexible depth model that can adapt to different depth configurations during inference. Experimental results show that our approach simultaneously supports decoding under 24 depth configurations and outperforms both individual training and LayerDrop, another method for training flexible depth models.
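The abstract describes the approach only at a high level, so the following is a minimal PyTorch sketch of the general idea: treat each depth configuration as a task and optimize them jointly, so that a single set of weights can decode at any supported depth. The encoder class, the prefix-of-layers depth scheme, the random depth sampling, and the uniform loss weighting below are all illustrative assumptions, not the paper's actual method.

```python
import random

import torch
import torch.nn as nn
import torch.nn.functional as F


class FlexibleDepthEncoder(nn.Module):
    """Toy Transformer encoder that can run with any prefix of its layers.

    Illustrative only: the paper's actual architecture, depth-sampling
    scheme, and task weighting are not specified in the abstract.
    """

    def __init__(self, d_model=64, nhead=4, max_layers=6):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
             for _ in range(max_layers)]
        )

    def forward(self, x, depth):
        # Run only the first `depth` layers, so one set of weights
        # serves every depth configuration at inference time.
        for layer in self.layers[:depth]:
            x = layer(x)
        return x


def multi_task_step(encoder, head, x, targets, depths, optimizer):
    """One training step that treats each sampled depth as a task.

    Averaging the per-depth losses keeps every sub-depth usable at
    test time; the uniform weighting here is an assumption.
    """
    optimizer.zero_grad()
    loss = sum(
        F.cross_entropy(
            head(encoder(x, d)).view(-1, head.out_features),
            targets.view(-1),
        )
        for d in depths
    ) / len(depths)
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    vocab = 100
    enc = FlexibleDepthEncoder()
    head = nn.Linear(64, vocab)
    opt = torch.optim.Adam(
        list(enc.parameters()) + list(head.parameters()), lr=1e-4
    )
    x = torch.randn(2, 8, 64)                # dummy source representations
    y = torch.randint(0, vocab, (2, 8))      # dummy target token ids
    sampled = random.sample(range(1, 7), 3)  # sample 3 of 6 possible depths
    print(multi_task_step(enc, head, x, y, sampled, opt))
```

At inference, the same trained encoder can be called with whichever depth fits a given device's latency budget, which is the property the abstract highlights.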
Pages: 4307-4312
Number of pages: 6