Binary Embedding-based Retrieval at Tencent

Cited by: 2
Authors
Gan, Yukang [1]
Ge, Yixiao [1]
Zhou, Chang [2]
Su, Shupeng [1]
Xu, Zhouchuan [3]
Xu, Xuyuan [2]
Hui, Quanchao [3]
Chen, Xiang [3]
Wang, Yexin [2]
Shan, Ying [1, 3]
Affiliations
[1] Tencent PCG, ARC Lab, Shenzhen, People's Republic of China
[2] Tencent Video, PCG, Shenzhen, People's Republic of China
[3] Tencent Search, PCG, Shenzhen, People's Republic of China
Keywords
embedding-based retrieval; embedding binarization; backward compatibility
DOI
10.1145/3580305.3599782
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Large-scale embedding-based retrieval (EBR) is the cornerstone of search-related industrial applications. Given a user query, an EBR system aims to identify relevant information from a large corpus of documents that may be tens or hundreds of billions in size. Storage and computation become expensive and inefficient with massive document collections and highly concurrent queries, making it difficult to scale further. To tackle this challenge, we propose a binary embedding-based retrieval (BEBR) engine equipped with a recurrent binarization algorithm that enables customized bits per dimension. Specifically, we compress the full-precision query and document embeddings, generally formulated as float vectors, into a composition of multiple binary vectors using a lightweight transformation model with residual multi-layer perceptron (MLP) blocks. The number of bits in the transformed binary vectors is jointly determined by the output dimension of the MLP blocks (termed m) and the number of residual blocks (termed u), i.e., m × (u + 1). We can therefore tailor the number of bits for different applications to trade off accuracy loss against cost savings. Importantly, we enable task-agnostic, efficient training of the binarization model using a new embedding-to-embedding strategy; e.g., training on millions of vectors requires only 2 V100 GPU hours. We also exploit compatible training of binary embeddings so that the BEBR engine can support indexing among multiple embedding versions within a unified system. To further realize efficient search, we propose Symmetric Distance Calculation (SDC) to achieve lower response time than Hamming codes. The technique exploits Single Instruction Multiple Data (SIMD) units widely available in current CPUs. We successfully deployed BEBR for web search and copyright detection in Tencent products, including Sogou, Tencent Video, QQ World, etc. The binarization algorithm can be seamlessly generalized to various tasks with multiple modalities, for instance, natural language processing (NLP) and computer vision (CV). Extensive experiments on offline benchmarks and online A/B tests demonstrate the efficiency and effectiveness of our method, which saves 30% to 50% of index costs with almost no loss of accuracy at the system level.
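The bit budget m × (u + 1) described in the abstract can be illustrated with a small sketch. This is not the authors' model: the learned residual MLP blocks are swapped for plain residual sign quantization (each step binarizes the error left by the previous steps and stores one scalar scale), and m is taken to be the input dimension. The function names `residual_binarize`, `reconstruct`, `squared_error`, and `total_bits` are hypothetical and exist only for this illustration.

```python
import random


def sign(v):
    # Map a float to +1 or -1; zero goes to +1 so each entry costs exactly one bit.
    return 1 if v >= 0 else -1


def residual_binarize(x, u):
    """Encode x as u + 1 sign vectors, each with one scalar scale.

    Assumption: this untrained residual quantizer stands in for the
    learned residual MLP blocks mentioned in the abstract.
    """
    residual = list(x)
    codes, scales = [], []
    for _ in range(u + 1):
        # The mean absolute value is the least-squares scale for a sign vector.
        scale = sum(abs(v) for v in residual) / len(residual)
        code = [sign(v) for v in residual]
        codes.append(code)
        scales.append(scale)
        # Pass on whatever this step failed to capture.
        residual = [v - scale * c for v, c in zip(residual, code)]
    return codes, scales


def reconstruct(codes, scales):
    # Sum the scaled sign vectors to approximate the original float vector.
    m = len(codes[0])
    return [sum(s * c[i] for c, s in zip(codes, scales)) for i in range(m)]


def squared_error(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


def total_bits(m, u):
    # Bit budget stated in the abstract: m bits for each of the u + 1 binary vectors.
    return m * (u + 1)


rng = random.Random(0)
m = 64
x = [rng.gauss(0.0, 1.0) for _ in range(m)]

errors = []
for u in range(4):
    codes, scales = residual_binarize(x, u)
    errors.append(squared_error(x, reconstruct(codes, scales)))
    print(f"u={u} bits={total_bits(m, u)} squared_error={errors[-1]:.4f}")
```

In this sketch each additional residual step adds m bits and lowers the reconstruction error, which is the accuracy-versus-cost trade-off the abstract refers to. How the actual BEBR model combines its binary vectors is not specified in this record.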
Pages: 4056-4067
Number of pages: 12
Related papers
50 in total
  • [1] Embedding-based Retrieval in Facebook Search
    Huang, Jui-Ting
    Sharma, Ashish
    Sun, Shuying
    Xia, Li
    Zhang, David
    Pronin, Philip
    Padmanabhan, Janani
    Ottaviano, Giuseppe
    Yang, Linjun
    KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020, : 2553 - 2561
  • [2] Embedding-based Product Retrieval in Taobao Search
    Li, Sen
    Lv, Fuyu
    Jin, Taiwei
    Lin, Guli
    Yang, Keping
    Zeng, Xiaoyi
    Wu, Xiao-Ming
    Ma, Qianli
    KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2021, : 3181 - 3189
  • [3] FAERY: An FPGA-accelerated Embedding-based Retrieval System
    Zeng, Chaoliang
    Luo, Layong
    Ning, Qingsong
    Han, Yaodong
    Jiang, Yuhang
    Tang, Ding
    Wang, Zilong
    Chen, Kai
    Guo, Chuanxiong
    PROCEEDINGS OF THE 16TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION, OSDI 2022, 2022, : 841 - 856
  • [4] Dynamic Embedding-based Retrieval for Personalized Item Recommendations at Instacart
    Ruan, Chuanwei
    Stewart, Allan
    Li, Han
    Ye, Ryan
    Vengerov, David
    Wang, Haixun
    COMPANION OF THE WORLD WIDE WEB CONFERENCE, WWW 2023, 2023, : 983 - 987
  • [5] A Comparison Between Term-Based and Embedding-Based Methods for Initial Retrieval
    Guo, Tonglei
    Guo, Jiafeng
    Fan, Yixing
    Lan, Yanyan
    Xu, Jun
    Cheng, Xueqi
    INFORMATION RETRIEVAL, CCIR 2018, 2018, 11168 : 28 - 40
  • [6] Contextual Path Retrieval: A Contextual Entity Relation Embedding-based Approach
    Lo, Pei-Chi
    Lim, Ee-Peng
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2023, 41 (01)
  • [7] Improving Embedding-Based Retrieval in Friend Recommendation with ANN Query Expansion
    Kung, Pau Perng-Hwa
    Fan, Zihao
    Zhao, Tong
    Liu, Yozen
    Lai, Zhixin
    Shi, Jiahui
    Wu, Yan
    Yu, Jun
    Shah, Neil
    Venkataraman, Ganesh
    PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024, 2024, : 2930 - 2934
  • [8] Embedding-based Query Expansion for Weighted Sequential Dependence Retrieval Model
    Balaneshin-kordan, Saeid
    Kotov, Alexander
    SIGIR'17: PROCEEDINGS OF THE 40TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2017, : 1213 - 1216
  • [9] QuadrupletBERT: An Efficient Model For Embedding-Based Large-Scale Retrieval
    Liu, Peiyang
    Wang, Sen
    Wang, Xi
    Ye, Wei
    Zhang, Shikun
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 3734 - 3739
  • [10] Spectral embedding-based multiview features fusion for content-based image retrieval
    Feng, Lin
    Yu, Laihang
    Zhu, Hai
    JOURNAL OF ELECTRONIC IMAGING, 2017, 26 (05)