Fully Non-autoregressive Neural Machine Translation: Tricks of the Trade

Cited by: 0
Authors:
Gu, Jiatao [1 ]
Kong, Xiang [2 ]
Affiliations:
[1] Facebook AI Research, Menlo Park, CA 94025 USA
[2] Carnegie Mellon University, Language Technologies Institute, Pittsburgh, PA USA
Keywords: (none listed)
DOI: (not available)
CLC Classification: TP18 [Artificial Intelligence Theory]
Subject Classification Codes: 081104; 0812; 0835; 1405
Abstract
Fully non-autoregressive neural machine translation (NAT) predicts all output tokens simultaneously in a single forward pass of the network, which significantly reduces inference latency at the cost of a quality drop relative to the Transformer baseline. In this work, we aim to close the performance gap while maintaining the latency advantage. We first inspect the fundamental issues of fully NAT models and adopt dependency reduction in the learning space of output tokens as the primary guidance. We then revisit methods from four different aspects that have proven effective for improving NAT models, and carefully combine these techniques with the necessary modifications. Extensive experiments on three translation benchmarks show that the proposed system achieves state-of-the-art results for fully NAT models and performs comparably to autoregressive and iterative NAT systems. For instance, one of the proposed models achieves 27.49 BLEU on WMT14 En-De with a 16.5x speed-up over a similarly sized autoregressive baseline under the same inference conditions. The implementation of our model is available here(1).
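To make the abstract's central contrast concrete, below is a minimal PyTorch sketch of autoregressive versus fully non-autoregressive decoding. It is not the authors' implementation: toy_decoder, its random projection PROJ, and the zero-token placeholders are hypothetical stand-ins for a real Transformer decoder and for the paper's actual input-initialization scheme.

    import torch

    VOCAB, MAX_LEN = 1000, 8
    torch.manual_seed(0)
    PROJ = torch.randn(VOCAB, VOCAB)  # hypothetical frozen decoder weights

    def toy_decoder(tokens):
        # Stand-in for a Transformer decoder: per-position vocabulary logits.
        return torch.nn.functional.one_hot(tokens, VOCAB).float() @ PROJ

    # Autoregressive baseline: one forward pass per output token, so
    # latency grows linearly with the target length.
    tokens = torch.zeros(1, 1, dtype=torch.long)           # <bos>
    for _ in range(MAX_LEN):
        next_tok = toy_decoder(tokens)[:, -1:].argmax(-1)  # last position only
        tokens = torch.cat([tokens, next_tok], dim=1)

    # Fully non-autoregressive: a single forward pass over placeholder
    # inputs decodes every position at once.
    placeholders = torch.zeros(1, MAX_LEN, dtype=torch.long)
    nat_tokens = toy_decoder(placeholders).argmax(-1)      # one pass, all tokens

The loop issues MAX_LEN sequential decoder calls while the NAT path issues exactly one; that difference in the number of sequential passes is the latency advantage the abstract quantifies as a 16.5x speed-up.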
Pages: 120-133 (14 pages)
Related Papers (50 in total; items [41]-[50] shown)
  • [41] Hint-Based Training for Non-Autoregressive Machine Translation
    Li, Zhuohan
    Lin, Zi
    He, Di
    Tian, Fei
    Qin, Tao
    Wang, Liwei
    Liu, Tie-Yan
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 5708 - 5713
  • [42] Learning to Recover from Multi-Modality Errors for Non-Autoregressive Neural Machine Translation
    Ran, Qiu
    Lin, Yankai
    Li, Peng
    Zhou, Jie
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 3059 - 3069
  • [43] Correcting translation for non-autoregressive transformer
    Wang, Shuheng
    Huang, Heyan
    Shi, Shumin
    Li, Dongbai
    Guo, Dongen
    APPLIED SOFT COMPUTING, 2025, 168
  • [44] Order-Agnostic Cross Entropy for Non-Autoregressive Machine Translation
    Du, Cunxiao
    Tu, Zhaopeng
    Jiang, Jing
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [45] A Study of Syntactic Multi-Modality in Non-Autoregressive Machine Translation
    Zhang, Kexun
    Wang, Rui
    Tan, Xu
    Guo, Junliang
    Ren, Yi
    Qin, Tao
    Liu, Tie-Yan
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 1747 - 1757
  • [46] Alleviating repetitive tokens in non-autoregressive machine translation with unlikelihood training
    Wang, Shuheng
    Shi, Shumin
    Huang, Heyan
    SOFT COMPUTING, 2024, 28 (5) : 4681 - 4688
  • [48] Revisiting Non-Autoregressive Translation at Scale
    Wang, Zhihao
    Wang, Longyue
    Su, Jinsong
    Yao, Junfeng
    Tu, Zhaopeng
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 12051 - 12065
  • [49] Non-Autoregressive Fully Parallel Deep Convolutional Neural Speech Synthesis
    Lee, Moa
    Lee, Junmo
    Chang, Joon-Hyuk
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 1150 - 1159
  • [50] Improving Non-autoregressive Machine Translation with Error Exposure and Consistency Regularization
    Chen, Xinran
    Duan, Sufeng
    Liu, Gongshen
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT III, NLPCC 2024, 2025, 15361 : 240 - 252