Manu: A Cloud Native Vector Database Management System

Cited by: 13
Authors
Guo, Rentong [1]
Luan, Xiaofan [1]
Xiang, Long [2]
Yan, Xiao [2]
Yi, Xiaomeng [1]
Luo, Jigao [1, 3]
Cheng, Qianya [1]
Xu, Weizhi [1]
Luo, Jiarui [2]
Liu, Frank [1]
Cao, Zhenshan [1]
Qiao, Yanliang [1]
Wang, Ting [1]
Tang, Bo [2]
Xie, Charles [1]
Affiliations
[1] Zilliz, Redwood City, CA 94065, USA
[2] Southern University of Science and Technology, Shenzhen, People's Republic of China
[3] Technical University of Munich, Munich, Germany
Source
Proceedings of the VLDB Endowment | 2022 / Vol. 15 / No. 12
Keywords
NEAREST-NEIGHBOR SEARCH; PRODUCT QUANTIZATION
DOI
10.14778/3554821.3554843
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the development of learning-based embedding models, embedding vectors are widely used for analyzing and searching unstructured data. As vector collections exceed billion-scale, fully managed and horizontally scalable vector databases are necessary. In the past three years, through interaction with our 1200+ industry users, we have sketched a vision for the features that next-generation vector databases should have, which include long-term evolvability, tunable consistency, good elasticity, and high performance. We present Manu, a cloud native vector database that implements these features. It is difficult to integrate all these features if we follow traditional DBMS design rules. As most vector data applications do not require complex data models and strong data consistency, our design philosophy is to relax the data model and consistency constraints in exchange for the aforementioned features. Specifically, Manu firstly exposes the write-ahead log (WAL) and binlog as backbone services. Secondly, write components are designed as log publishers while all read-only analytic and search components are designed as independent subscribers to the log services. Finally, we utilize multi-version concurrency control (MVCC) and a delta consistency model to simplify the communication and cooperation among the system components. These designs achieve a low coupling among the system components, which is essential for elasticity and evolution. We also extensively optimize Manu for performance and usability with hardware-aware implementations and support for complex search semantics. Manu has been used for many applications, including, but not limited to, recommendation, multimedia, language, medicine and security. We evaluated Manu in three typical application scenarios to demonstrate its efficiency, elasticity, and scalability.
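The log-centric design described in the abstract (a writer publishes to a shared log, read-side components subscribe independently, and queries tolerate a bounded staleness) can be illustrated with a toy sketch. This is not Manu's code or API: the class names `Log` and `Subscriber`, the integer timestamps, and the exact form of the staleness check are all assumptions made for illustration, and the actual delta consistency mechanism in the paper may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Log:
    """Append-only log shared by one writer and many readers (hypothetical)."""
    entries: list = field(default_factory=list)  # (timestamp, key, vector)

    def publish(self, ts, key, vector):
        # The writer only appends; it never talks to readers directly.
        self.entries.append((ts, key, vector))


@dataclass
class Subscriber:
    """Read-side component that replays the log at its own pace (hypothetical)."""
    log: Log
    offset: int = 0      # next log position to consume
    synced_ts: int = 0   # timestamp of the last entry applied locally
    data: dict = field(default_factory=dict)

    def poll(self, max_entries):
        """Apply up to max_entries new log entries to the local view."""
        batch = self.log.entries[self.offset:self.offset + max_entries]
        for ts, key, vector in batch:
            self.data[key] = vector
            self.synced_ts = ts
        self.offset += len(batch)

    def can_serve(self, query_ts, delta):
        """Assumed staleness check: the view may lag the query by at most delta."""
        return query_ts - self.synced_ts <= delta


log = Log()
for ts in range(1, 6):
    log.publish(ts, f"id{ts}", [float(ts)])

reader = Subscriber(log)
reader.poll(max_entries=3)  # the reader has now applied timestamps 1 to 3
```

In this sketch a query issued at timestamp 5 is served immediately when the tolerance is 2, but must wait for the reader to consume more of the log when the tolerance is 1. Because the writer and the reader share only the log, either side can be scaled or replaced without changing the other, which is the low coupling the abstract emphasizes.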
Pages: 3548-3561
Number of pages: 14