DuoSQL: towards elastic data warehousing via separated data management and processing

Cited by: 0
Authors
Zhang, Weikang [1]
Liu, Zhi [2]
Bai, Tongxin [3]
Zheng, Furong [4]
Jin, Wenming [4]
Wang, Yang [2]
Affiliations
[1] Southern Univ Sci & Technol, Coll Engn, Shenzhen 518055, Guangdong, Peoples R China
[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518000, Guangdong, Peoples R China
[3] Beijing Acad Artificial Intelligence, Beijing 100084, Peoples R China
[4] SIAT Suntang Big Data & AI Joint Innovat Lab, Shenzhen 518052, Guangdong, Peoples R China
Source
Keywords
TUNING SYSTEM; AWARE;
DOI
10.1093/comjnl/bxaf014
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Companies today increasingly move data warehouses (DWs) to the cloud in pursuit of cost-effective data management. To fully achieve this goal, a cloud DW system must adjust its resource provisioning as workload requirements change. However, the traditional data warehousing architecture lacks the flexibility for on-demand resource control, which severely restricts cost optimization and quality of service for both cloud providers and users. Building cloud DWs therefore calls for new architectures. This paper explores an architecture that decouples data management from data processing to enable on-demand resource control, which enhances system elasticity and adaptability. The separation is not free, however: cooperation overhead can be high if it is not well optimized. As a proof of concept, we build a prototype system, DuoSQL, that uses PostgreSQL for data management and Spark for data processing. To optimize cooperation, we conduct joint parameter tuning to improve overall system performance. We validate the system with the TPC-H benchmark. The results show that the decoupling approach is flexible and offers significant performance potential.
Pages: 13