Directed Acyclic Transformer for Non-Autoregressive Machine Translation

Cited by: 0
Authors
Huang, Fei [1,2,3]
Zhou, Hao [3]
Liu, Yang [2]
Li, Hang [3]
Huang, Minlie [1,2]
Affiliations
[1] Tsinghua Univ, CoAI Grp, Beijing, Peoples R China
[2] Tsinghua Univ, State Key Lab Intelligent Technol & Syst, Beijing Natl Res Ctr Informat Sci & Technol, Inst Artificial Intelligence, Dept Comp Sci & Technol, Beijing, Peoples R China
[3] ByteDance AI Lab, Beijing, Peoples R China
Funding
National Science Foundation (USA)
Keywords
DOI
Not available
CLC Number
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Non-autoregressive Transformers (NATs) significantly reduce decoding latency by generating all tokens in parallel. However, such independent predictions prevent NATs from capturing the dependencies between tokens needed to generate multiple possible translations. In this paper, we propose the Directed Acyclic Transformer (DA-Transformer), which represents the hidden states as a Directed Acyclic Graph (DAG), where each path through the DAG corresponds to a specific translation. The whole DAG simultaneously captures multiple translations and facilitates fast prediction in a non-autoregressive fashion. Experiments on the raw training data of the WMT benchmark show that DA-Transformer substantially outperforms previous NATs by about 3 BLEU on average; it is the first NAT model to achieve results competitive with autoregressive Transformers without relying on knowledge distillation.
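The record itself carries no implementation detail, but the mechanism the abstract describes (each path through the DAG is one candidate translation) implies that scoring a given target sequence means marginalizing over all paths that could emit it, which is tractable with dynamic programming. The following is a minimal NumPy sketch of that marginalization, not the authors' implementation: the names dag_log_likelihood, log_emit, and log_trans are hypothetical, and it assumes every path starts at the first decoder vertex and ends at the last.

```python
import numpy as np
from scipy.special import logsumexp

def dag_log_likelihood(log_emit, log_trans, target):
    """Log-probability of `target`, summed over all DAG paths.

    log_emit  : (L, V) per-vertex token log-probabilities log p_i(y)
    log_trans : (L, L) transition log-probabilities log P(j | i), for j > i
    target    : sequence of N token ids, with N <= L
    """
    L = log_emit.shape[0]
    # f[i] = log prob of emitting the target prefix along a path ending at vertex i
    f = np.full(L, -np.inf)
    f[0] = log_emit[0, target[0]]      # assumed: all paths start at vertex 0
    for tok in target[1:]:
        g = np.full(L, -np.inf)
        for j in range(1, L):
            # extend from every predecessor i < j, then emit `tok` at vertex j
            g[j] = logsumexp(f[:j] + log_trans[:j, j]) + log_emit[j, tok]
        f = g
    return f[-1]                       # assumed: all paths end at vertex L-1
```

As written this is O(N·L²) per sequence; in practice the inner loop over predecessors would be vectorized. Training on the negative of this quantity is one plausible reading of how the DAG "simultaneously captures multiple translations" while decoding stays parallel.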
Pages: 19
Related Papers (50 in total)
  • [1] Glancing Transformer for Non-Autoregressive Neural Machine Translation. Qian, Lihua; Zhou, Hao; Bao, Yu; Wang, Mingxuan; Qiu, Lin; Zhang, Weinan; Yu, Yong; Li, Lei. In: 59th Annual Meeting of the Association for Computational Linguistics and 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021), Vol. 1, 2021: 1993-2003.
  • [2] Non-autoregressive Machine Translation with Disentangled Context Transformer. Kasai, Jungo; Cross, James; Ghazvininejad, Marjan; Gu, Jiatao. In: International Conference on Machine Learning (ICML 2020), Vol. 119, 2020.
  • [3] Enriching Non-Autoregressive Transformer with Syntactic and Semantic Structures for Neural Machine Translation. Liu, Ye; Wan, Yao; Zhang, Jian-Guo; Zhao, Wenting; Yu, Philip S. In: 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021), 2021: 1235-1244.
  • [4] Integrating Translation Memories into Non-Autoregressive Machine Translation. Xu, Jitao; Crego, Josep; Yvon, Francois. In: 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023), 2023: 1326-1338.
  • [5] Enhanced encoder for non-autoregressive machine translation. Wang, Shuheng; Shi, Shumin; Huang, Heyan. Machine Translation, 2021, 35(4): 595-609.
  • [6] Non-Autoregressive Machine Translation with Auxiliary Regularization. Wang, Yiren; Tian, Fei; He, Di; Qin, Tao; Zhai, ChengXiang; Liu, Tie-Yan. In: Thirty-Third AAAI Conference on Artificial Intelligence (AAAI 2019), 2019: 5377-5384.
  • [7] A Survey of Non-Autoregressive Neural Machine Translation. Li, Feng; Chen, Jingxian; Zhang, Xuejun. Electronics, 2023, 12(13).
  • [8] Non-Autoregressive Machine Translation with Latent Alignments. Saharia, Chitwan; Chan, William; Saxena, Saurabh; Norouzi, Mohammad. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), 2020: 1098-1108.
  • [9] Modeling Coverage for Non-Autoregressive Neural Machine Translation. Shan, Yong; Feng, Yang; Shao, Chenze. In: 2021 International Joint Conference on Neural Networks (IJCNN 2021), 2021.
  • [10] Incorporating history and future into non-autoregressive machine translation. Wang, Shuheng; Huang, Heyan; Shi, Shumin. Computer Speech and Language, 2022, 77.