Locality-Sensitive Hashing for Long Context Neural Machine Translation

Cited by: 0
Authors
Petrick, Frithjof [1 ]
Rosendahl, Jan [1 ]
Herold, Christian [1 ]
Ney, Hermann [1 ]
Affiliations
[1] Rhein Westfal TH Aachen, Comp Sci Dept, Human Language Technol & Pattern Recognit Grp, D-52056 Aachen, Germany
Funding
European Research Council
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
After its introduction, the Transformer architecture (Vaswani et al., 2017) quickly became the gold standard for neural machine translation. A major advantage of the Transformer over previous architectures is its faster training, achieved by replacing recurrent layers with attention and thereby parallelizing completely across timesteps. However, this also leads to one of the Transformer's biggest problems: quadratic time and memory complexity with respect to the input length. In this work we adapt the locality-sensitive hashing (LSH) approach of Kitaev et al. (2020) to self-attention in the Transformer, extend it to cross-attention, and apply this memory-efficient framework to sentence- and document-level machine translation. Our experiments show that the LSH attention scheme comes at the cost of slightly reduced translation quality on the sentence level. For document-level NMT we are able to include much larger context sizes than is possible with the baseline Transformer. However, more context improves neither translation quality nor scores on targeted test suites.
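The abstract describes attention restricted to hash buckets; below is a minimal NumPy sketch of that general idea, assuming the angular LSH scheme of Kitaev et al. (2020), where a vector's bucket is the argmax over its projections [xR; -xR] for a random rotation R. The function names lsh_buckets and lsh_attention are illustrative, not from the paper, and the sketch masks the full score matrix for clarity rather than sorting by bucket and chunking, so it does not yet realize the sub-quadratic cost.

```python
import numpy as np

def lsh_buckets(x, R):
    """Angular LSH (Kitaev et al., 2020): project onto a random rotation R
    and take the bucket id as argmax over the concatenation [xR; -xR]."""
    proj = x @ R                                        # (len, n_buckets/2)
    return np.argmax(np.concatenate([proj, -proj], axis=-1), axis=-1)

def lsh_attention(q, k, v, n_buckets=8, seed=0):
    """Toy LSH attention: a query attends only to keys in the same bucket.
    Hashing q and k with a *shared* rotation is what lets the same trick
    cover cross-attention, where q and k come from different sequences.
    NOTE: masking the full score matrix is still O(n^2); efficient
    implementations sort positions by bucket and attend within chunks."""
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((q.shape[-1], n_buckets // 2))  # shared rotation
    bq = lsh_buckets(q, R)                              # (tgt_len,)
    bk = lsh_buckets(k, R)                              # (src_len,)

    scores = (q @ k.T) / np.sqrt(q.shape[-1])
    same_bucket = bq[:, None] == bk[None, :]            # (tgt_len, src_len)
    # -1e9 instead of -inf: if a query's bucket is empty on the key side,
    # the softmax degrades to uniform attention instead of producing NaNs.
    scores = np.where(same_bucket, scores, -1e9)

    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Cross-attention shapes: target queries over a source of different length.
rng = np.random.default_rng(1)
q = rng.standard_normal((16, 64))
k = rng.standard_normal((24, 64))
v = rng.standard_normal((24, 64))
print(lsh_attention(q, k, v).shape)                     # (16, 64)
```

In self-attention, queries and keys are typically shared (and normalized) so that matching buckets imply nearby vectors; for cross-attention the shared rotation plays that role across the two sequences.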
Pages: 32-42
Page count: 11