Directed Acyclic Transformer for Non-Autoregressive Machine Translation

Cited by: 0
Authors
Huang, Fei [1,2,3]
Zhou, Hao [3]
Liu, Yang [2]
Li, Hang [3]
Huang, Minlie [1,2]
Affiliations
[1] Tsinghua Univ, CoAI Grp, Beijing, Peoples R China
[2] Tsinghua Univ, State Key Lab Intelligent Technol & Syst, Beijing Natl Res Ctr Informat Sci & Technol, Inst Artificial Intelligence, Dept Comp Sci & Technol, Beijing, Peoples R China
[3] ByteDance AI Lab, Beijing, Peoples R China
Funding
National Science Foundation (USA)
Keywords
DOI
Not available
CLC Number
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Non-autoregressive Transformers (NATs) significantly reduce decoding latency by generating all tokens in parallel. However, such independent predictions prevent NATs from capturing the dependencies between tokens needed to generate multiple possible translations. In this paper, we propose the Directed Acyclic Transformer (DA-Transformer), which represents the hidden states as a Directed Acyclic Graph (DAG), where each path through the DAG corresponds to a specific translation. The whole DAG simultaneously captures multiple translations and facilitates fast prediction in a non-autoregressive fashion. Experiments on the raw training data of the WMT benchmark show that DA-Transformer substantially outperforms previous NATs by about 3 BLEU on average; it is the first NAT model to achieve results competitive with autoregressive Transformers without relying on knowledge distillation.
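The record itself carries no implementation detail, but the mechanism the abstract describes (each path through the DAG is one candidate translation) implies that scoring a given target sequence means marginalizing over all paths that could emit it, which is tractable with dynamic programming. The following is a minimal NumPy sketch of that marginalization, not the authors' implementation: the names dag_log_likelihood, log_emit, and log_trans are hypothetical, and it assumes every path starts at the first decoder vertex and ends at the last.

```python
import numpy as np
from scipy.special import logsumexp

def dag_log_likelihood(log_emit, log_trans, target):
    """Log-probability of `target`, summed over all DAG paths.

    log_emit  : (L, V) per-vertex token log-probabilities log p_i(y)
    log_trans : (L, L) transition log-probabilities log P(j | i), for j > i
    target    : sequence of N token ids, with N <= L
    """
    L = log_emit.shape[0]
    # f[i] = log prob of emitting the target prefix along a path ending at vertex i
    f = np.full(L, -np.inf)
    f[0] = log_emit[0, target[0]]      # assumed: all paths start at vertex 0
    for tok in target[1:]:
        g = np.full(L, -np.inf)
        for j in range(1, L):
            # extend from every predecessor i < j, then emit `tok` at vertex j
            g[j] = logsumexp(f[:j] + log_trans[:j, j]) + log_emit[j, tok]
        f = g
    return f[-1]                       # assumed: all paths end at vertex L-1
```

As written this is O(N·L²) per sequence; in practice the inner loop over predecessors would be vectorized. Training on the negative of this quantity is one plausible reading of how the DAG "simultaneously captures multiple translations" while decoding stays parallel.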
Pages: 19
Related Papers (50 in total)
  • [1] Glancing Transformer for Non-Autoregressive Neural Machine Translation. Qian, Lihua; Zhou, Hao; Bao, Yu; Wang, Mingxuan; Qiu, Lin; Zhang, Weinan; Yu, Yong; Li, Lei. In: 59th Annual Meeting of the Association for Computational Linguistics and 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021), Vol. 1, 2021: 1993-2003.
  • [2] Non-autoregressive Machine Translation with Disentangled Context Transformer. Kasai, Jungo; Cross, James; Ghazvininejad, Marjan; Gu, Jiatao. In: International Conference on Machine Learning (ICML 2020), Vol. 119, 2020.
  • [3] Enriching Non-Autoregressive Transformer with Syntactic and Semantic Structures for Neural Machine Translation. Liu, Ye; Wan, Yao; Zhang, Jian-Guo; Zhao, Wenting; Yu, Philip S. In: 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021), 2021: 1235-1244.
  • [4] Integrating Translation Memories into Non-Autoregressive Machine Translation. Xu, Jitao; Crego, Josep; Yvon, Francois. In: 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023), 2023: 1326-1338.
  • [5] Enhanced encoder for non-autoregressive machine translation. Wang, Shuheng; Shi, Shumin; Huang, Heyan. Machine Translation, 2021, 35(4): 595-609.
  • [6] Non-Autoregressive Machine Translation with Auxiliary Regularization. Wang, Yiren; Tian, Fei; He, Di; Qin, Tao; Zhai, ChengXiang; Liu, Tie-Yan. In: Thirty-Third AAAI Conference on Artificial Intelligence (AAAI 2019), 2019: 5377-5384.
  • [7] A Survey of Non-Autoregressive Neural Machine Translation. Li, Feng; Chen, Jingxian; Zhang, Xuejun. Electronics, 2023, 12(13).
  • [8] Non-Autoregressive Machine Translation with Latent Alignments. Saharia, Chitwan; Chan, William; Saxena, Saurabh; Norouzi, Mohammad. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), 2020: 1098-1108.
  • [9] Modeling Coverage for Non-Autoregressive Neural Machine Translation. Shan, Yong; Feng, Yang; Shao, Chenze. In: 2021 International Joint Conference on Neural Networks (IJCNN 2021), 2021.
  • [10] Incorporating history and future into non-autoregressive machine translation. Wang, Shuheng; Huang, Heyan; Shi, Shumin. Computer Speech and Language, 2022, 77.