Fully Non-autoregressive Neural Machine Translation: Tricks of the Trade

Cited by: 0
Authors
Gu, Jiatao [1]
Kong, Xiang [2]
Affiliations
[1] Facebook AI Research, Menlo Park, CA 94025, USA
[2] Carnegie Mellon University, Language Technologies Institute, Pittsburgh, PA, USA
Keywords: (none listed)
DOI: (none available)
Chinese Library Classification (CLC): TP18 [Theory of Artificial Intelligence]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
Fully non-autoregressive neural machine translation (NAT) predicts all output tokens simultaneously in a single forward pass of the network, which significantly reduces inference latency at the cost of a quality drop relative to the Transformer baseline. In this work, we aim to close the performance gap while maintaining the latency advantage. We first inspect the fundamental issues of fully NAT models and adopt dependency reduction in the learning space of output tokens as the primary guidance. We then revisit methods from four different aspects that have proven effective for improving NAT models, and carefully combine these techniques with the necessary modifications. Extensive experiments on three translation benchmarks show that the proposed system achieves state-of-the-art results for fully NAT models and performs comparably to autoregressive and iterative NAT systems. For instance, one of the proposed models achieves 27.49 BLEU on WMT14 En-De with a 16.5x speed-up over a similarly sized autoregressive baseline under the same inference conditions. The implementation of our model is available here(1).
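To make the contrast in the abstract concrete, the sketch below compares token-by-token autoregressive decoding with a fully non-autoregressive single-pass decode. It is a minimal illustration under assumed interfaces (a `model(src, tgt)` call returning per-position logits and a separate `length_predictor`, both hypothetical stand-ins), not the authors' released implementation.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: `model` and `length_predictor` are assumed
# stand-ins, not the implementation linked from the paper.

@torch.no_grad()
def autoregressive_decode(model: nn.Module, src: torch.Tensor,
                          bos_id: int, eos_id: int, max_len: int = 64) -> list:
    """Autoregressive baseline: one forward pass per generated token."""
    tgt = [bos_id]
    for _ in range(max_len):
        logits = model(src, torch.tensor([tgt]))   # (1, len(tgt), vocab)
        next_id = int(logits[0, -1].argmax())      # conditions on all prior tokens
        tgt.append(next_id)
        if next_id == eos_id:
            break
    return tgt

@torch.no_grad()
def fully_nat_decode(model: nn.Module, length_predictor, src: torch.Tensor,
                     mask_id: int) -> list:
    """Fully non-autoregressive decoding: all tokens in ONE forward pass.

    Each position is filled by an independent argmax, so no output token
    conditions on another -- the token-dependency problem that the
    paper's "dependency reduction" guidance is meant to ease.
    """
    tgt_len = int(length_predictor(src))             # predicted target length
    placeholder = torch.full((1, tgt_len), mask_id)  # masked target positions
    logits = model(src, placeholder)                 # single forward pass
    return logits.argmax(dim=-1)[0].tolist()
```

The autoregressive loop issues one forward pass per output token, while the NAT path issues exactly one, which is where speed-ups of the kind reported (16.5x) originate; the independent per-position argmax is also why reducing dependencies among output tokens (e.g., via knowledge distillation) is central to recovering quality.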
Pages: 120-133 (14 pages)
Related papers (50 in total; items [31]-[40] shown)
  • [31] Non-autoregressive Machine Translation with Disentangled Context Transformer
    Kasai, Jungo
    Cross, James
    Ghazvininejad, Marjan
    Gu, Jiatao
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020
  • [32] Non-Autoregressive Document-Level Machine Translation
    Bao, Guangsheng
    Teng, Zhiyang
    Zhou, Hao
    Yan, Jianhao
    Zhang, Yue
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 14791 - 14803
  • [33] Efficient Domain Adaptation for Non-Autoregressive Machine Translation
    You, Wangjie
    Guo, Pei
    Li, Juntao
    Chen, Kehai
    Zhang, Min
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 13657 - 13670
  • [34] Aligned Cross Entropy for Non-Autoregressive Machine Translation
    Ghazvininejad, Marjan
    Karpukhin, Vladimir
    Zettlemoyer, Luke
    Levy, Omer
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020
  • [35] End-to-End Non-Autoregressive Neural Machine Translation with Connectionist Temporal Classification
    Libovicky, Jindrich
    Helcl, Jindrich
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 3016 - 3021
  • [36] AligNART: Non-autoregressive Neural Machine Translation by Jointly Learning to Estimate Alignment and Translate
    Song, Jongyoon
    Kim, Sungwon
    Yoon, Sungroh
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 1 - 14
  • [37] A Non-Autoregressive Neural Machine Translation Model With Iterative Length Update of Target Sentence
    Lim, Yeon-Soo
    Park, Eun-Ju
    Song, Hyun-Je
    Park, Seong-Bae
    IEEE ACCESS, 2022, 10 : 43341 - 43350
  • [38] Jointly Masked Sequence-to-Sequence Model for Non-Autoregressive Neural Machine Translation
    Guo, Junliang
    Xu, Linli
    Chen, Enhong
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 376 - 385
  • [39] Multilingual Non-Autoregressive Machine Translation without Knowledge Distillation
    Huang, Chenyang
    Huang, Fei
    Zheng, Zaixiang
    Zaiane, Osmar
    Zhou, Hao
    Mou, Lili
    13TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING AND THE 3RD CONFERENCE OF THE ASIA-PACIFIC CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, IJCNLP-AACL 2023, 2023, : 161 - 170
  • [40] Non-Autoregressive Machine Translation with a Novel Masked Language Model
    Li, Ke
    Li, Jie
    Wang, Jun
    2022 19TH INTERNATIONAL COMPUTER CONFERENCE ON WAVELET ACTIVE MEDIA TECHNOLOGY AND INFORMATION PROCESSING (ICCWAMTIP), 2022,