Fully Non-autoregressive Neural Machine Translation: Tricks of the Trade

Cited: 0
Authors
Gu, Jiatao [1 ]
Kong, Xiang [2 ]
Affiliations
[1] Facebook AI Res, Menlo Pk, CA 94025 USA
[2] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA USA
Keywords
DOI
(not available)
CLC classification
TP18 [Artificial intelligence theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Fully non-autoregressive neural machine translation (NAT) predicts all tokens simultaneously in a single forward pass of the neural network, which significantly reduces inference latency at the expense of a quality drop compared to the Transformer baseline. In this work, we target closing the performance gap while maintaining the latency advantage. We first inspect the fundamental issues of fully NAT models, and adopt dependency reduction in the learning space of output tokens as the primary guidance. Then, we revisit methods in four different aspects that have been proven effective for improving NAT models, and carefully combine these techniques with necessary modifications. Our extensive experiments on three translation benchmarks show that the proposed system achieves state-of-the-art results for fully NAT models, and obtains comparable performance with autoregressive and iterative NAT systems. For instance, one of the proposed models achieves 27.49 BLEU points on WMT14 En-De with a 16.5x speed-up over a similarly sized autoregressive baseline under the same inference condition. The implementation of our model is available here(1).
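The latency contrast described in the abstract comes down to how many decoder calls inference requires: an autoregressive model makes one call per target position, while a fully NAT model emits every position from a single call. The toy sketch below (a hypothetical stand-in with fixed random logits, not the paper's Transformer) just counts those calls to illustrate the point:

```python
import numpy as np

rng = np.random.default_rng(0)
SEQ_LEN, VOCAB = 6, 10
# Fixed toy logits standing in for the output of a trained decoder.
LOGITS = rng.normal(size=(SEQ_LEN, VOCAB))

def forward(positions, n_calls):
    """Toy 'decoder network': returns logits for the requested positions
    and counts how many times it is invoked."""
    n_calls[0] += 1
    return LOGITS[positions]

def ar_decode():
    """Autoregressive baseline: one network call per target position."""
    n_calls, tokens = [0], []
    for t in range(SEQ_LEN):
        logits = forward([t], n_calls)
        tokens.append(int(logits[0].argmax()))
    return tokens, n_calls[0]

def nat_decode():
    """Fully non-autoregressive: all positions decoded in a single call."""
    n_calls = [0]
    logits = forward(list(range(SEQ_LEN)), n_calls)
    return logits.argmax(axis=-1).tolist(), n_calls[0]

ar_tokens, ar_calls = ar_decode()
nat_tokens, nat_calls = nat_decode()
assert ar_tokens == nat_tokens            # same greedy output on this toy model
assert (ar_calls, nat_calls) == (SEQ_LEN, 1)  # T sequential calls vs. 1
```

In a real model the single NAT pass sacrifices conditioning on previously generated tokens, which is exactly the quality gap the paper's dependency-reduction techniques aim to close.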
Pages: 120-133
Page count: 14
Related papers
50 records in total
  • [21] Non-Autoregressive Machine Translation as Constrained HMM
    Li, Haoran
    Jie, Zhanming
Lu, Wei
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 12361 - 12372
  • [22] Non-Autoregressive Machine Translation with Latent Alignments
    Saharia, Chitwan
    Chan, William
    Saxena, Saurabh
    Norouzi, Mohammad
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 1098 - 1108
  • [23] Enriching Non-Autoregressive Transformer with Syntactic and Semantic Structures for Neural Machine Translation
    Liu, Ye
    Wan, Yao
    Zhang, Jian-Guo
    Zhao, Wenting
    Yu, Philip S.
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 1235 - 1244
  • [24] Task-Level Curriculum Learning for Non-Autoregressive Neural Machine Translation
    Liu, Jinglin
    Ren, Yi
    Tan, Xu
    Zhang, Chen
    Qin, Tao
    Zhao, Zhou
    Liu, Tie-Yan
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 3861 - 3867
  • [25] Minimizing the Bag-of-Ngrams Difference for Non-Autoregressive Neural Machine Translation
    Shao, Chenze
    Zhang, Jinchao
    Feng, Yang
    Meng, Fandong
    Zhou, Jie
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 198 - 205
  • [26] Non-Autoregressive Neural Machine Translation with Consistency Regularization Optimized Variational Framework
    Zhu, Minghao
    Wang, Junli
    Yan, Chungang
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 607 - 617
  • [27] Fine-Tuning by Curriculum Learning for Non-Autoregressive Neural Machine Translation
    Guo, Junliang
    Tan, Xu
    Xu, Linli
    Qin, Tao
    Chen, Enhong
    Liu, Tie-Yan
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 7839 - 7846
  • [28] Incorporating history and future into non-autoregressive machine translation
    Wang, Shuheng
    Huang, Heyan
    Shi, Shumin
    COMPUTER SPEECH AND LANGUAGE, 2022, 77
  • [29] Non-Autoregressive Machine Translation: It's Not as Fast as it Seems
Helcl, Jindrich
    Haddow, Barry
    Birch, Alexandra
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 1780 - 1790
  • [30] Aligned Cross Entropy for Non-Autoregressive Machine Translation
    Ghazvininejad, Marjan
    Karpukhin, Vladimir
    Zettlemoyer, Luke
    Levy, Omer
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119