Manu: A Cloud Native Vector Database Management System

Cited by: 13
Authors
Guo, Rentong [1]
Luan, Xiaofan [1]
Xiang, Long [2]
Yan, Xiao [2]
Yi, Xiaomeng [1]
Luo, Jigao [1,3]
Cheng, Qianya [1]
Xu, Weizhi [1]
Luo, Jiarui [2]
Liu, Frank [1]
Cao, Zhenshan [1]
Qiao, Yanliang [1]
Wang, Ting [1]
Tang, Bo [2]
Xie, Charles [1]
Affiliations
[1] Zilliz, Redwood City, CA 94065 USA
[2] Southern University of Science and Technology, Shenzhen, People's Republic of China
[3] Technical University of Munich, Munich, Germany
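Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2022 / Vol. 15 / Issue 12
Keywords
NEAREST-NEIGHBOR SEARCH; PRODUCT QUANTIZATION;
DOI
10.14778/3554821.3554843
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract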
With the development of learning-based embedding models, embedding vectors are widely used for analyzing and searching unstructured data. As vector collections exceed billion-scale, fully managed and horizontally scalable vector databases are necessary. In the past three years, through interaction with our 1200+ industry users, we have sketched a vision for the features that next-generation vector databases should have, which include long-term evolvability, tunable consistency, good elasticity, and high performance. We present Manu, a cloud native vector database that implements these features. It is difficult to integrate all these features if we follow traditional DBMS design rules. As most vector data applications do not require complex data models and strong data consistency, our design philosophy is to relax the data model and consistency constraints in exchange for the aforementioned features. Specifically, Manu firstly exposes the write-ahead log (WAL) and binlog as backbone services. Secondly, write components are designed as log publishers while all read-only analytic and search components are designed as independent subscribers to the log services. Finally, we utilize multi-version concurrency control (MVCC) and a delta consistency model to simplify the communication and cooperation among the system components. These designs achieve a low coupling among the system components, which is essential for elasticity and evolution. We also extensively optimize Manu for performance and usability with hardware-aware implementations and support for complex search semantics. Manu has been used for many applications, including, but not limited to, recommendation, multimedia, language, medicine and security. We evaluated Manu in three typical application scenarios to demonstrate its efficiency, elasticity, and scalability.
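The log-centric design described in the abstract can be illustrated with a minimal in-memory sketch. It is not Manu's implementation: the names Log, LogEntry, and SearchNode are hypothetical, timestamps are plain integers, and delta consistency is assumed here to mean that a query tolerates a view at most delta time units behind the query time. MVCC is not modeled.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    ts: int        # logical timestamp assigned by the writer (assumption)
    key: str
    vector: list


class Log:
    """Append-only log shared by one publisher and any number of subscribers."""

    def __init__(self):
        self.entries = []

    def publish(self, entry):
        self.entries.append(entry)


class SearchNode:
    """Read-only subscriber that builds its own view by replaying the log."""

    def __init__(self, log):
        self.log = log
        self.offset = 0    # next log position to consume
        self.seen_ts = 0   # timestamp of the latest entry applied
        self.data = {}

    def consume(self, max_entries):
        end = min(len(self.log.entries), self.offset + max_entries)
        for entry in self.log.entries[self.offset:end]:
            self.data[entry.key] = entry.vector
            self.seen_ts = entry.ts
        self.offset = end

    def search(self, query, now, delta):
        # Catch up only while the view is staler than the caller tolerates.
        while now - self.seen_ts > delta and self.offset < len(self.log.entries):
            self.consume(1)
        if not self.data:
            return None

        def sq_dist(key):
            return sum((a - b) ** 2 for a, b in zip(self.data[key], query))

        # Brute-force nearest neighbor by squared Euclidean distance.
        return min(self.data, key=sq_dist)


# The writer only publishes; it never talks to the search node directly.
log = Log()
log.publish(LogEntry(1, "a", [0.0, 0.0]))
log.publish(LogEntry(2, "b", [1.0, 1.0]))
log.publish(LogEntry(3, "c", [5.0, 5.0]))

node = SearchNode(log)
node.consume(1)  # the node has applied only entry "a" so far

stale_ok = node.search([5.0, 4.0], now=3, delta=2)  # staleness of 2 is tolerated
fresh = node.search([5.0, 4.0], now=3, delta=0)     # forces a full catch-up
```

In this sketch the writer and the search node share nothing but the log, so either side could be replaced or scaled without changing the other, which is the low coupling the abstract attributes to the design. A larger delta lets a query run on a staler view without waiting for the node to catch up.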
Pages: 3548-3561
Page count: 14