Understanding and Improving Hidden Representation for Neural Machine Translation

Cited: 0
Authors
Li, Guanlin [1 ]
Liu, Lemao [2 ]
Li, Xintong [3 ]
Zhu, Conghui [1 ]
Zhao, Tiejun [1 ]
Shi, Shuming [2 ]
Affiliations
[1] Harbin Inst Technol, Harbin, Peoples R China
[2] Tencent AI Lab, Bellevue, WA USA
[3] Chinese Univ Hong Kong, Hong Kong, Peoples R China
Keywords
DOI: none
CLC classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multilayer architectures are currently the gold standard for large-scale neural machine translation. Existing works have explored methods for understanding the hidden representations; however, they have not used this understanding to improve translation quality in a principled way. Toward understanding that serves performance improvement, we first artificially construct a sequence of nested relative tasks and measure the feature generalization ability of the learned hidden representations over these tasks. Based on our findings, we then propose to regularize the layer-wise representations with all tree-induced tasks. To overcome the computational bottleneck caused by the large number of regularization terms, we design efficient approximation methods that select a few coarse-to-fine tasks for regularization. Extensive experiments on two widely used datasets demonstrate that the proposed methods incur only small extra overheads in training and none in testing, while achieving consistent improvements (up to +1.3 BLEU) over a state-of-the-art translation model.
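The regularization scheme described in the abstract can be sketched roughly as follows: each selected layer's hidden states feed an auxiliary classifier for a tree-induced task, with coarser label sets (fewer classes) for lower layers and finer ones for higher layers, and the auxiliary losses are added to the main translation loss. This is an illustrative assumption of the setup, not the paper's actual implementation; the function names (`softmax_xent`, `layerwise_regularized_loss`) and the linear-probe formulation are hypothetical.

```python
import numpy as np

def softmax_xent(logits, labels):
    """Mean cross-entropy of a softmax classifier.

    logits: (n_tokens, n_classes); labels: (n_tokens,) integer class ids.
    Uses the max-shift trick for numerical stability.
    """
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def layerwise_regularized_loss(main_loss, layer_states, probes, task_labels,
                               lam=0.1):
    """Add coarse-to-fine auxiliary task losses to the translation loss.

    main_loss:    scalar translation (e.g. cross-entropy) loss.
    layer_states: list of (n_tokens, d) hidden-state matrices, one per
                  selected layer.
    probes:       list of (d, k_l) linear classifier weights, one per task;
                  k_l grows with the layer (coarse labels low, fine high).
    task_labels:  list of (n_tokens,) label arrays, one per task.
    """
    reg = 0.0
    for h, w, y in zip(layer_states, probes, task_labels):
        reg += softmax_xent(h @ w, y)  # auxiliary loss for this layer/task
    return main_loss + lam * reg
```

With `lam=0` the objective reduces to the plain translation loss; selecting only a few tasks (short `probes` list) mirrors the approximation the abstract uses to keep the number of regularization terms small.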
Pages: 466-477
Page count: 12
Related papers
50 records in total
  • [41] Improving Transformer-Based Neural Machine Translation with Prior Alignments
    Nguyen, Thien
    Nguyen, Lam
    Tran, Phuoc
    Nguyen, Huu
    [J]. COMPLEXITY, 2021, 2021
  • [42] Improving Robustness of Neural Machine Translation with Multi-task Learning
    Zhou, Shuyan
    Zeng, Xiangkai
    Zhou, Yingqi
    Anastasopoulos, Antonios
    Neubig, Graham
    [J]. FOURTH CONFERENCE ON MACHINE TRANSLATION (WMT 2019), 2019, : 565 - 571
  • [43] Improving Stylized Neural Machine Translation with Iterative Dual Knowledge Transfer
    Wu, Xuanxuan
    Liu, Jian
    Li, Xinjie
    Xu, Jinan
    Chen, Yufeng
    Zhang, Yujie
    Huang, Hui
    [J]. PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 3971 - 3977
  • [44] Improving Document-Level Neural Machine Translation with Domain Adaptation
    Ul Haq, Sami
    Rauf, Sadaf Abdul
    Shoukat, Arslan
    Noor-e-Hira
    [J]. NEURAL GENERATION AND TRANSLATION, 2020, : 225 - 231
  • [45] Understanding and Detecting Hallucinations in Neural Machine Translation via Model Introspection
    Xu, Weijia
    Agrawal, Sweta
    Briakou, Eleftheria
    Martindale, Marianna J.
    Carpuat, Marine
    [J]. TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2023, 11 : 546 - 564
  • [46] Improving Thai-Lao Neural Machine Translation with Similarity Lexicon
    Yu, Zhiqiang
    Huang, Yuxin
    Guo, Junjun
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2022, 42 (04) : 4005 - 4014
  • [47] Improving Mongolian-Chinese Neural Machine Translation with Morphological Noise
    Ji, Yatu
    Hou, Hongxu
    Wu, Nier
    Chen, Junjie
    [J]. 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019): STUDENT RESEARCH WORKSHOP, 2019, : 123 - 135
  • [48] Understanding the Properties of Minimum Bayes Risk Decoding in Neural Machine Translation
    Mueller, Mathias
    Sennrich, Rico
    [J]. 59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1 (ACL-IJCNLP 2021), 2021, : 259 - 272
  • [49] Improving Non-autoregressive Neural Machine Translation with Monolingual Data
    Zhou, Jiawei
    Keung, Phillip
    [J]. 58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 1893 - 1898
  • [50] Improving Beam Search by Removing Monotonic Constraint for Neural Machine Translation
    Shu, Raphael
    Nakayama, Hideki
    [J]. PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2, 2018, : 339 - 344