Binary Embedding-based Retrieval at Tencent

Cited by: 2
Authors
Gan, Yukang [1]
Ge, Yixiao [1]
Zhou, Chang [2]
Su, Shupeng [1]
Xu, Zhouchuan [3]
Xu, Xuyuan [2]
Hui, Quanchao [3]
Chen, Xiang [3]
Wang, Yexin [2]
Shan, Ying [1,3]
Affiliations
[1] Tencent PCG, ARC Lab, Shenzhen, Peoples R China
[2] Tencent Video, PCG, Shenzhen, Peoples R China
[3] Tencent Search, PCG, Shenzhen, Peoples R China
Keywords
embedding-based retrieval; embedding binarization; backward compatibility
DOI
10.1145/3580305.3599782
Chinese Library Classification: TP [Automation technology; computer technology]
Discipline code: 0812
Abstract
Large-scale embedding-based retrieval (EBR) is the cornerstone of search-related industrial applications. Given a user query, the EBR system aims to identify relevant information from a corpus of documents that may be tens or hundreds of billions in size. With massive documents and highly concurrent queries, storage and computation become expensive and inefficient, making it difficult to scale up further. To tackle this challenge, we propose a binary embedding-based retrieval (BEBR) engine equipped with a recurrent binarization algorithm that enables customized bits per dimension. Specifically, we compress the full-precision query and document embeddings, generally formulated as float vectors, into a composition of multiple binary vectors using a lightweight transformation model with residual multi-layer perceptron (MLP) blocks. The number of bits in the transformed binary vectors is jointly determined by the output dimension of the MLP blocks (termed m) and the number of residual blocks (termed u), i.e., m × (u + 1). We can therefore tailor the number of bits for different applications to trade off accuracy loss against cost savings. Importantly, we enable task-agnostic, efficient training of the binarization model using a new embedding-to-embedding strategy, e.g., only 2 V100 GPU hours are required to train on millions of vectors. We also exploit compatible training of binary embeddings so that the BEBR engine can support indexing across multiple embedding versions within a unified system. To further realize efficient search, we propose Symmetric Distance Calculation (SDC), which achieves lower response time than Hamming codes by exploiting the Single Instruction Multiple Data (SIMD) units widely available in current CPUs. We have successfully deployed BEBR for web search and copyright detection in Tencent products, including Sogou, Tencent Video, QQ World, etc. The binarization algorithm generalizes seamlessly to various tasks and modalities, for instance, natural language processing (NLP) and computer vision (CV). Extensive experiments on offline benchmarks and online A/B tests demonstrate the efficiency and effectiveness of our method, saving 30%~50% of index costs with almost no loss of accuracy at the system level.
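The abstract describes compressing a float embedding into (u + 1) binary codes of m bits each via residual MLP blocks, trained embedding-to-embedding. The following is a minimal PyTorch sketch of that idea only; the class names, layer sizes, sign-based binarizer with a straight-through estimator, and reconstruction loss are illustrative assumptions, not the authors' exact architecture or training recipe.

import torch
import torch.nn as nn

class SignSTE(torch.autograd.Function):
    """Sign binarizer with a straight-through estimator so gradients pass through."""
    @staticmethod
    def forward(ctx, x):
        return torch.sign(x)
    @staticmethod
    def backward(ctx, grad_output):
        return grad_output

class RecurrentBinarizer(nn.Module):
    """Hypothetical residual-MLP head: (u + 1) blocks, each emitting an m-bit code."""
    def __init__(self, dim: int, m: int, u: int):
        super().__init__()
        # Each block maps the current residual to an m-bit binary code.
        self.blocks = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, m))
             for _ in range(u + 1)]
        )
        # Decoders project each code back to the embedding space so the next
        # block can encode what remains unexplained (the residual).
        self.decoders = nn.ModuleList([nn.Linear(m, dim) for _ in range(u + 1)])

    def forward(self, x):
        codes, residual = [], x
        for enc, dec in zip(self.blocks, self.decoders):
            b = SignSTE.apply(enc(residual))   # m-bit code in {-1, +1}
            codes.append(b)
            residual = residual - dec(b)       # pass the remaining residual onward
        return torch.cat(codes, dim=-1)        # m * (u + 1) bits per vector

# Embedding-to-embedding training sketch: regress the original float embedding
# from the concatenated binary codes (no task labels needed).
if __name__ == "__main__":
    dim, m, u = 128, 64, 3                     # 64 * (3 + 1) = 256 bits per vector
    model = RecurrentBinarizer(dim, m, u)
    head = nn.Linear(m * (u + 1), dim)         # hypothetical reconstruction head
    emb = torch.randn(32, dim)
    loss = nn.functional.mse_loss(head(model(emb)), emb)
    loss.backward()

Because the total code length is m × (u + 1), accuracy versus storage can be traded off per application by adjusting m (bits per block) and u (number of residual blocks), as the abstract states.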
Pages: 4056-4067
Page count: 12
Related Papers (50 total)
  • [31] Explanations for Network Embedding-Based Link Predictions
    Kang, Bo
    Lijffijt, Jefrey
    De Bie, Tijl
    MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2021, PT I, 2021, 1524 : 473 - 488
  • [32] MEAL: Manifold Embedding-based Active Learning
    Sreenivasaiah, Deepthi
    Otterbach, Johannes
    Wollmann, Thomas
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 1029 - 1037
  • [33] An Embedding-Based Topic Model for Document Classification
    Seifollahi, Sattar
    Piccardi, Massimo
    Jolfaei, Alireza
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2021, 20 (03)
  • [34] An Embedding-Based Approach to Repairing Question Semantics
    Zhou, Haixin
    Wang, Kewen
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS: DASFAA 2021 INTERNATIONAL WORKSHOPS, 2021, 12680 : 107 - 122
  • [35] Embedding-based approximate query for knowledge graph
    Qiu, Jingyi
    Zhang, Duxi
    Song, Aibo
    Wang, Honglin
    Zhang, Tianbo
    Jin, Jiahui
    Fang, Xiaolin
    Li, Yaqi
    Journal of Southeast University (English Edition), 2024, 40 (04) : 417 - 424
  • [36] EMBEDDING-BASED INTERPOLATION ON THE SPECIAL ORTHOGONAL GROUP
    Gawlik, Evan S.
    Leok, Melvin
    SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2018, 40 (02): : A721 - A746
  • [37] An Embedding-based Approach to Recommending SPARQL Queries
    Zhang, Lijing
    Zhang, Xiaowang
    Feng, Zhiyong
    2018 IEEE 30TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2018, : 991 - 998
  • [38] SEMS: Scalable Embedding Memory System for Accelerating Embedding-Based DNNs
    Kim, Sejin
    Kim, Jungwoo
    Jang, Yongjoo
    Kung, Jaeha
    Lee, Sungjin
    IEEE COMPUTER ARCHITECTURE LETTERS, 2022, 21 (02) : 157 - 160
  • [39] Word Embedding-Based Topic Similarity Measures
    Terragni, Silvia
    Fersini, Elisabetta
    Messina, Enza
    NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS (NLDB 2021), 2021, 12801 : 33 - 45
  • [40] Neural embedding-based indices for semantic search
    Lashkari, Fatemeh
    Bagheri, Ebrahim
    Ghorbani, Ali A.
    INFORMATION PROCESSING & MANAGEMENT, 2019, 56 (03) : 733 - 755