Uncertainty-aware non-autoregressive neural machine translation

Cited by: 0
Authors
Liu, Chuanming [1 ]
Yu, Jingqi [2 ]
Affiliations
[1] Shanghai Jiao Tong Univ, 800, Dongchuan Rd, Shanghai 200240, Peoples R China
[2] CCB Fintech, 99, Yincheng Rd, Shanghai 200120, Peoples R China
Source
Computer Speech and Language
Keywords
Bayesian deep learning; Non-autoregressive; Machine translation; Active learning; Monte Carlo dropout;
DOI
10.1016/j.csl.2022.101444
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Most existing non-autoregressive neural machine translation (NAT) models use the posterior probability to indicate model confidence during training, which lags behind the uncertainty estimation (UE) methods that have been deployed successfully in other natural language processing (NLP) tasks. Previous research has largely ignored a systematic exploration of UE methods for NAT. In this paper, we propose an Active-Learning-based strategy to investigate whether these more sophisticated uncertainty-aware methods are also more effective for NAT. In addition, we provide an in-depth analysis of the impact of several widely used UE methods and propose tailored variants of them. Finally, we incorporate the best-performing methods into the practical one-pass GLAT model to improve its performance. Experimental results demonstrate that, to a certain extent, uncertainty-aware UE methods combined with a two-step training paradigm represent model confidence better than the posterior probability for token-level decision-making in NAT.
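As an illustration of the kind of uncertainty estimation the abstract refers to, the sketch below shows Monte Carlo dropout (one of the listed keywords) used to score token-level confidence for a non-autoregressive decoder. It is a minimal sketch under stated assumptions, not the authors' implementation: `model` and `src_tokens` are hypothetical placeholders for a NAT model that returns per-position target logits in a single forward pass.

```python
# Minimal sketch (not the paper's code): Monte Carlo dropout uncertainty
# estimation for token-level confidence in a non-autoregressive decoder.
# `model` is assumed to map source tokens to per-position target logits
# of shape (batch, tgt_len, vocab) in one forward pass.
import torch
import torch.nn.functional as F

def mc_dropout_token_uncertainty(model, src_tokens, num_samples=8):
    model.train()  # keep dropout layers stochastic at inference time
    probs = []
    with torch.no_grad():
        for _ in range(num_samples):
            logits = model(src_tokens)              # one stochastic forward pass
            probs.append(F.softmax(logits, dim=-1))
    mean_probs = torch.stack(probs).mean(dim=0)     # average over MC samples
    # Predictive entropy per target position: higher entropy = less confident.
    entropy = -(mean_probs * mean_probs.clamp_min(1e-9).log()).sum(dim=-1)
    confidence = mean_probs.max(dim=-1).values      # probability of the top token
    return confidence, entropy
```

Per-position entropy, rather than the single-pass posterior probability, can then be used to rank target positions, for example to decide which predictions to keep and which to re-predict in a glancing or iterative NAT decoder, or which examples to prioritise in an active learning loop.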
Pages: 14