Citus: Distributed PostgreSQL for Data-Intensive Applications

Cited by: 10
Authors
Cubukcu, Umur [1]
Erdogan, Ozgun [1]
Pathak, Sumedh [1]
Sannakkayala, Sudhakar [1]
Slot, Marco [1]
Affiliations
[1] Microsoft Corp, Redmond, WA 98052 USA
Keywords
PostgreSQL; distributed database; relational database; database extension
DOI
10.1145/3448016.3457551
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Citus is an open source distributed database engine for PostgreSQL that is implemented as an extension. Citus gives users the ability to distribute data, queries, and transactions in PostgreSQL across a cluster of PostgreSQL servers to handle the needs of data-intensive applications. The development of Citus has largely been driven by conversations with companies looking to scale PostgreSQL beyond a single server and their workload requirements. This paper describes the requirements of four common workload patterns and how Citus addresses those requirements. It also shares benchmark results demonstrating the performance and scalability of Citus in each of the workload patterns and describes how Microsoft uses Citus to address one of its most challenging data problems.
Pages: 2490-2502
Number of pages: 13