Fully Non-autoregressive Neural Machine Translation: Tricks of the Trade

Cited by: 0
Authors
Gu, Jiatao [1]
Kong, Xiang [2]
Affiliations
[1] Facebook AI Research, Menlo Park, CA 94025, USA
[2] Carnegie Mellon University, Language Technologies Institute, Pittsburgh, PA, USA
Keywords: (none listed)
DOI: (none available)
Chinese Library Classification (CLC): TP18 [Theory of Artificial Intelligence]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
Fully non-autoregressive neural machine translation (NAT) predicts all output tokens simultaneously in a single forward pass of the network, which significantly reduces inference latency at the cost of a quality drop relative to the Transformer baseline. In this work, we aim to close the performance gap while maintaining the latency advantage. We first inspect the fundamental issues of fully NAT models and adopt dependency reduction in the learning space of output tokens as the primary guidance. We then revisit methods from four different aspects that have proven effective for improving NAT models, and carefully combine these techniques with the necessary modifications. Extensive experiments on three translation benchmarks show that the proposed system achieves state-of-the-art results for fully NAT models and performs comparably to autoregressive and iterative NAT systems. For instance, one of the proposed models achieves 27.49 BLEU on WMT14 En-De with a 16.5x speed-up over a similarly sized autoregressive baseline under the same inference conditions. The implementation of our model is available here(1).
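To make the contrast in the abstract concrete, the sketch below compares token-by-token autoregressive decoding with a fully non-autoregressive single-pass decode. It is a minimal illustration under assumed interfaces (a `model(src, tgt)` call returning per-position logits and a separate `length_predictor`, both hypothetical stand-ins), not the authors' released implementation.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: `model` and `length_predictor` are assumed
# stand-ins, not the implementation linked from the paper.

@torch.no_grad()
def autoregressive_decode(model: nn.Module, src: torch.Tensor,
                          bos_id: int, eos_id: int, max_len: int = 64) -> list:
    """Autoregressive baseline: one forward pass per generated token."""
    tgt = [bos_id]
    for _ in range(max_len):
        logits = model(src, torch.tensor([tgt]))   # (1, len(tgt), vocab)
        next_id = int(logits[0, -1].argmax())      # conditions on all prior tokens
        tgt.append(next_id)
        if next_id == eos_id:
            break
    return tgt

@torch.no_grad()
def fully_nat_decode(model: nn.Module, length_predictor, src: torch.Tensor,
                     mask_id: int) -> list:
    """Fully non-autoregressive decoding: all tokens in ONE forward pass.

    Each position is filled by an independent argmax, so no output token
    conditions on another -- the token-dependency problem that the
    paper's "dependency reduction" guidance is meant to ease.
    """
    tgt_len = int(length_predictor(src))             # predicted target length
    placeholder = torch.full((1, tgt_len), mask_id)  # masked target positions
    logits = model(src, placeholder)                 # single forward pass
    return logits.argmax(dim=-1)[0].tolist()
```

The autoregressive loop issues one forward pass per output token, while the NAT path issues exactly one, which is where speed-ups of the kind reported (16.5x) originate; the independent per-position argmax is also why reducing dependencies among output tokens (e.g., via knowledge distillation) is central to recovering quality.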
Pages: 120-133 (14 pages)
Related papers (50 in total; items [31]-[40] shown)
  • [31] Non-autoregressive Machine Translation with Disentangled Context Transformer
    Kasai, Jungo
    Cross, James
    Ghazvininejad, Marjan
    Gu, Jiatao
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020
  • [32] Non-Autoregressive Document-Level Machine Translation
    Bao, Guangsheng
    Teng, Zhiyang
    Zhou, Hao
    Yan, Jianhao
    Zhang, Yue
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 14791 - 14803
  • [33] Efficient Domain Adaptation for Non-Autoregressive Machine Translation
    You, Wangjie
    Guo, Pei
    Li, Juntao
    Chen, Kehai
    Zhang, Min
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 13657 - 13670
  • [34] Aligned Cross Entropy for Non-Autoregressive Machine Translation
    Ghazvininejad, Marjan
    Karpukhin, Vladimir
    Zettlemoyer, Luke
    Levy, Omer
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020
  • [35] End-to-End Non-Autoregressive Neural Machine Translation with Connectionist Temporal Classification
    Libovicky, Jindrich
    Helcl, Jindrich
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 3016 - 3021
  • [36] AligNART: Non-autoregressive Neural Machine Translation by Jointly Learning to Estimate Alignment and Translate
    Song, Jongyoon
    Kim, Sungwon
    Yoon, Sungroh
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 1 - 14
  • [37] A Non-Autoregressive Neural Machine Translation Model With Iterative Length Update of Target Sentence
    Lim, Yeon-Soo
    Park, Eun-Ju
    Song, Hyun-Je
    Park, Seong-Bae
    IEEE ACCESS, 2022, 10 : 43341 - 43350
  • [38] Jointly Masked Sequence-to-Sequence Model for Non-Autoregressive Neural Machine Translation
    Guo, Junliang
    Xu, Linli
    Chen, Enhong
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 376 - 385
  • [39] Multilingual Non-Autoregressive Machine Translation without Knowledge Distillation
    Huang, Chenyang
    Huang, Fei
    Zheng, Zaixiang
    Zaiane, Osmar
    Zhou, Hao
    Mou, Lili
    13TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING AND THE 3RD CONFERENCE OF THE ASIA-PACIFIC CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, IJCNLP-AACL 2023, 2023, : 161 - 170
  • [40] Non-Autoregressive Machine Translation with a Novel Masked Language Model
    Li, Ke
    Li, Jie
    Wang, Jun
    2022 19TH INTERNATIONAL COMPUTER CONFERENCE ON WAVELET ACTIVE MEDIA TECHNOLOGY AND INFORMATION PROCESSING (ICCWAMTIP), 2022,