Transformer Memory as a Differentiable Search Index

Cited: 0
Authors
Tay, Yi [1 ]
Tran, Vinh Q. [1 ]
Dehghani, Mostafa [1 ]
Ni, Jianmo [1 ]
Bahri, Dara [1 ]
Mehta, Harsh [1 ]
Qin, Zhen [1 ]
Hui, Kai [1 ]
Zhao, Zhe [1 ]
Gupta, Jai [1 ]
Schuster, Tal [1 ]
Cohen, William W. [1]
Metzler, Donald [1 ]
Affiliations
[1] Google Research, Mountain View, CA 94043, USA
Keywords
DOI
Not available
CLC Number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we demonstrate that information retrieval can be accomplished with a single Transformer, in which all information about the corpus is encoded in the parameters of the model. To this end, we introduce the Differentiable Search Index (DSI), a new paradigm that learns a text-to-text model that maps string queries directly to relevant docids; in other words, a DSI model answers queries directly using only its parameters, dramatically simplifying the whole retrieval process. We study variations in how documents and their identifiers are represented, variations in training procedures, and the interplay between models and corpus sizes. Experiments demonstrate that given appropriate design choices, DSI significantly outperforms strong baselines such as dual encoder models. Moreover, DSI demonstrates strong generalization capabilities, outperforming a BM25 baseline in a zero-shot setup.
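The abstract describes DSI as a single sequence-to-sequence model trained both to "index" (map document text to its docid) and to "retrieve" (map a query to a relevant docid), so that retrieval reduces to ordinary autoregressive decoding. The sketch below illustrates that setup on a toy two-document corpus; the Hugging Face t5-small checkpoint, the string docids, the single training pass, and the example texts are illustrative assumptions, not the paper's actual configuration or released code.

# A minimal sketch of the DSI idea, not the authors' implementation.
# Assumptions: Hugging Face `transformers` and `torch` are installed, a small
# T5 checkpoint ("t5-small") stands in for the paper's larger models, docids
# are plain strings decoded as text, and the "corpus" holds two toy documents.

import torch
from transformers import T5ForConditionalGeneration, T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Indexing examples map document text -> docid; retrieval examples map query -> docid.
corpus = {
    "doc_17": "transformers can encode an entire corpus in their parameters",
    "doc_42": "dual encoders retrieve documents with nearest-neighbour search",
}
train_pairs = [(text, docid) for docid, text in corpus.items()]            # indexing
train_pairs += [("how do dual encoders retrieve documents?", "doc_42")]    # retrieval

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
model.train()
for source, target in train_pairs:                                         # one toy pass
    batch = tokenizer(source, return_tensors="pt")
    labels = tokenizer(target, return_tensors="pt").input_ids
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Inference: the model decodes a docid directly from the query string.
# Beam search over docid tokens (constrained to valid docids) would yield a ranked list.
model.eval()
query = tokenizer("how do dual encoders retrieve documents?", return_tensors="pt")
with torch.no_grad():
    predicted = model.generate(**query, max_new_tokens=8)
print(tokenizer.decode(predicted[0], skip_special_tokens=True))

In the paper itself, the key design choices are how docids are represented (unstructured atomic identifiers, naive strings, or semantically structured strings) and how indexing and retrieval examples are mixed during training; the toy pass above collapses those choices for brevity.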
Pages: 13
Related Papers (50 in total)
  • [41] Transformer with Memory Replay
    Liu, Rui
    Mozafari, Barzan
[J]. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 7567 - 7575
  • [42] Recurrent Memory Transformer
    Bulatov, Aydar
    Kuratov, Yuri
    Burtsev, Mikhail S.
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [43] On the uniqueness of the coincidence index on orientable differentiable manifolds
    Staecker, P. Christopher
    [J]. TOPOLOGY AND ITS APPLICATIONS, 2007, 154 (09) : 1961 - 1970
  • [44] Differentiable Slimming for Memory-Efficient Transformers
    Penkov, Nikolay
    Balaskas, Konstantinos
    Rapp, Martin
    Henkel, Joerg
    [J]. IEEE EMBEDDED SYSTEMS LETTERS, 2023, 15 (04) : 186 - 189
  • [45] Progressive Differentiable Architecture Search: Bridging the Depth Gap between Search and Evaluation
    Chen, Xin
    Xie, Lingxi
    Wu, Jun
    Tian, Qi
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 1294 - 1303
  • [46] Differentiable neural architecture search with channel performance measurement
    Pan, Jie
    Zheng, Xue-Chi
    Zou, Xiao-Yu
    [J]. Kongzhi yu Juece/Control and Decision, 2024, 39 (07): : 2151 - 2160
  • [47] MergeNAS: Merge Operations into One for Differentiable Architecture Search
    Wang, Xiaoxing
    Xue, Chao
    Yan, Junchi
    Yang, Xiaokang
    Hu, Yonggang
    Sun, Kewei
    [J]. PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 3065 - 3072
  • [48] Enhanced Differentiable Architecture Search Based on Asymptotic Regularization
    Jin, Cong
    Huang, Jinjie
    Chen, Yuanjian
    Gong, Yuqing
    [J]. CMC-COMPUTERS MATERIALS & CONTINUA, 2024, 78 (02): : 1547 - 1568
  • [49] D-DARTS: Distributed Differentiable Architecture Search
    Heuillet, Alexandre
    Tabia, Hedi
    Arioui, Hichem
    Youcef-Toumi, Kamal
    [J]. PATTERN RECOGNITION LETTERS, 2023, 176 : 42 - 48
  • [50] DOTS: Decoupling Operation and Topology in Differentiable Architecture Search
    Gu, Yu-Chao
    Wang, Li-Juan
    Liu, Yun
    Yang, Yi
    Wu, Yu-Huan
    Lu, Shao-Ping
    Cheng, Ming-Ming
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 12306 - 12315