Binary Embedding-based Retrieval at Tencent

Cited by: 2

Authors:

Gan, Yukang [1]

Ge, Yixiao [1]

Zhou, Chang [2]

Su, Shupeng [1]

Xu, Zhouchuan [3]

Xu, Xuyuan [2]

Hui, Quanchao [3]

Chen, Xiang [3]

Wang, Yexin [2]

Shan, Ying [1,3]

Affiliations:

[1] Tencent PCG, ARC Lab, Shenzhen, People's Republic of China

[2] Tencent Video, PCG, Shenzhen, People's Republic of China

[3] Tencent Search, PCG, Shenzhen, People's Republic of China

Source:

PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023 | 2023

Keywords:

embedding-based retrieval; embedding binarization; backward compatibility

DOI:

10.1145/3580305.3599782

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Large-scale embedding-based retrieval (EBR) is the cornerstone of search-related industrial applications. Given a user query, an EBR system aims to identify relevant information from a large corpus of documents that may be tens or hundreds of billions in size. Storage and computation become expensive and inefficient with massive document collections and highly concurrent queries, making it difficult to scale up further. To tackle this challenge, we propose a binary embedding-based retrieval (BEBR) engine equipped with a recurrent binarization algorithm that enables customized bits per dimension. Specifically, we compress the full-precision query and document embeddings, generally formulated as float vectors, into a composition of multiple binary vectors using a lightweight transformation model with residual multi-layer perceptron (MLP) blocks. The number of bits in the transformed binary vectors is jointly determined by the output dimension of the MLP blocks (termed m) and the number of residual blocks (termed u), i.e., m × (u + 1). We can therefore tailor the number of bits for different applications to trade off accuracy loss against cost savings. Importantly, we enable task-agnostic, efficient training of the binarization model using a new embedding-to-embedding strategy; e.g., training on millions of vectors requires only 2 V100 GPU hours. We also exploit compatible training of binary embeddings so that the BEBR engine can support indexing among multiple embedding versions within a unified system. To further realize efficient search, we propose Symmetric Distance Calculation (SDC) to achieve lower response time than Hamming codes. The technique exploits Single Instruction Multiple Data (SIMD) units widely available in current CPUs. We have successfully deployed BEBR in web search and copyright detection for Tencent products, including Sogou, Tencent Video, QQ World, etc. The binarization algorithm can be seamlessly generalized to various tasks with multiple modalities, for instance, natural language processing (NLP) and computer vision (CV). Extensive experiments on offline benchmarks and online A/B tests demonstrate the efficiency and effectiveness of our method, saving 30% to 50% of index costs with almost no loss of accuracy at the system level.
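The bit budget m × (u + 1) described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the learned residual MLP blocks are replaced here by a plain greedy sign-and-subtract step, input values are assumed to lie in [-1, 1], and the per-level scale 2^-(j+1) is an assumption made for illustration. The function names `recurrent_binarize`, `reconstruct`, and `total_bits` are hypothetical.

```python
from typing import List


def recurrent_binarize(x: List[float], u: int) -> List[List[int]]:
    """Greedy residual binarization (illustrative stand-in for learned MLP blocks).

    Assumes every value of x lies in [-1, 1]. Returns u + 1 binary vectors
    (entries in {-1, +1}), one base code plus u residual codes.
    """
    residual = list(x)
    codes = []
    for j in range(u + 1):
        scale = 2.0 ** -(j + 1)  # assumed per-level scale, halves each level
        b = [1 if r >= 0 else -1 for r in residual]
        codes.append(b)
        # Subtract what this level explains; the next level encodes the rest.
        residual = [r - scale * s for r, s in zip(residual, b)]
    return codes


def reconstruct(codes: List[List[int]]) -> List[float]:
    """Recompose the approximate float vector from the binary codes."""
    out = [0.0] * len(codes[0])
    for j, b in enumerate(codes):
        scale = 2.0 ** -(j + 1)
        out = [o + scale * s for o, s in zip(out, b)]
    return out


def total_bits(m: int, u: int) -> int:
    """Bits per embedding: m dimensions times (u + 1) binary vectors."""
    return m * (u + 1)


if __name__ == "__main__":
    x = [0.9, -0.3, 0.05, -0.75]  # m = 4
    codes = recurrent_binarize(x, u=3)
    approx = reconstruct(codes)
    print(total_bits(len(x), 3), approx)
```

Under these assumptions, each extra residual level halves the worst-case per-dimension error, which is one way to see how raising u trades index size for accuracy.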

Citation

Pages: 4056-4067

Number of pages: 12
