SWORD: workload-aware data placement and replica selection for cloud data management systems

Cited by: 53
Authors
Kumar, K. Ashwin [1 ]
Quamar, Abdul [1 ]
Deshpande, Amol [1 ]
Khuller, Samir [1 ]
Affiliations
[1] Univ Maryland, College Pk, MD 20742 USA
Source
VLDB JOURNAL | 2014 / Vol. 23 / Issue 06
Keywords
Cloud data management; Hypergraph partitioning; Data placement; Replication; Resource minimization; Scalability;
DOI
10.1007/s00778-014-0362-1
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Cloud computing is increasingly being seen as a way to reduce infrastructure costs and add elasticity, and is being used by a wide range of organizations. Cloud data management systems today need to serve a range of different workloads, from analytical read-heavy workloads to transactional (OLTP) workloads. For both the service providers and the users, it is critical to minimize the consumption of resources like CPU, memory, communication bandwidth, and energy, without compromising on service-level agreements if any. In this article, we develop a workload-aware data placement and replication approach, called SWORD, for minimizing resource consumption in such an environment. Specifically, we monitor and model the expected workload as a hypergraph and develop partitioning techniques that minimize the average query span, i.e., the average number of machines involved in the execution of a query or a transaction. We empirically justify the use of query span as the metric to optimize, for both analytical and transactional workloads, and develop a series of replication and data placement algorithms by drawing connections to several well-studied graph theoretic concepts. We introduce a suite of novel techniques to achieve high scalability by reducing the overhead of partitioning and query routing. To deal with workload changes, we propose an incremental repartitioning technique that modifies data placement in small steps without resorting to complete repartitioning. We propose the use of fine-grained quorums defined at the level of groups of data items to control the cost of distributed updates, improve throughput, and adapt to different workloads. We empirically illustrate the benefits of our approach through a comprehensive experimental evaluation for two classes of workloads. For analytical read-only workloads, we show that our techniques result in significant reduction in total resource consumption. 
For OLTP workloads, we show that our approach improves transaction latencies and overall throughput by minimizing the number of distributed transactions.
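The query span metric named in the abstract can be illustrated with a minimal sketch. It assumes a simplified setting with no replication, where each data item lives on exactly one machine, so a query's span is the number of distinct machines holding its items. The names `query_span`, `average_query_span`, `workload`, and the two placements are hypothetical and chosen only for illustration; this is not the paper's partitioning algorithm, only the objective it is said to minimize.

```python
from typing import Dict, Hashable, Iterable, List, Set


def query_span(items: Iterable[Hashable], placement: Dict[Hashable, int]) -> int:
    """Number of distinct machines touched by one query (no replication assumed)."""
    return len({placement[item] for item in items})


def average_query_span(
    queries: List[Set[Hashable]], placement: Dict[Hashable, int]
) -> float:
    """Mean span over the workload; each query is one hyperedge over data items."""
    return sum(query_span(q, placement) for q in queries) / len(queries)


# Hypothetical workload: every query is the set of data items it accesses.
workload = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"a", "b", "c"}]

# Two hypothetical placements of four items onto two machines (0 and 1).
placement_colocated = {"a": 0, "b": 0, "c": 1, "d": 1}
placement_scattered = {"a": 0, "b": 1, "c": 0, "d": 1}

print(average_query_span(workload, placement_colocated))
print(average_query_span(workload, placement_scattered))
```

In this toy workload the placement that co-locates items accessed together yields the lower average span, which is the effect a workload-aware partitioner aims for. With replication, computing a query's span would additionally require choosing which replicas to read, which this sketch does not attempt.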
Pages: 845 - 870
Page count: 26
Related papers
50 in total
  • [1] SWORD: workload-aware data placement and replica selection for cloud data management systems
    K. Ashwin Kumar
    Abdul Quamar
    Amol Deshpande
    Samir Khuller
    [J]. The VLDB Journal, 2014, 23 : 845 - 870
  • [2] Cost-aware automatic scaling and workload-aware replica management for edge-cloud environment
    Li, Chunlin
    Liu, Jun
    Lu, Bo
    Luo, Youlong
    [J]. JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2021, 180
  • [3] Efficient and Adaptable Query Workload-Aware Management for RDF Data
    MahmoudiNasab, Hooran
    Sakr, Sherif
    [J]. WEB INFORMATION SYSTEM ENGINEERING-WISE 2010, 2010, 6488 : 390 - +
  • [4] An Improved Dynamic Data Replica Selection and Placement in Cloud
    Rajalakshmi, A.
    Vijayakumar, D.
    Srinivasagan, K. G.
    [J]. 2014 INTERNATIONAL CONFERENCE ON RECENT TRENDS IN INFORMATION TECHNOLOGY (ICRTIT), 2014,
  • [5] FORESEER: Workload-aware Data Storage for MapReduce
    Zou, Jia
    Shi, Juwei
    Liu, Tongping
    Cao, Zhao
    Wang, Chen
    [J]. 2015 IEEE 35th International Conference on Distributed Computing Systems, 2015, : 746 - 747
  • [6] Workload-aware Power Management of Cluster Systems
    Liu, Zhuo
    Liang, Aihua
    Xiao, Limin
    Ruan, Li
    [J]. PROCEEDINGS OF THE NINTH INTERNATIONAL SYMPOSIUM ON DISTRIBUTED COMPUTING AND APPLICATIONS TO BUSINESS, ENGINEERING AND SCIENCE (DCABES 2010), 2010, : 603 - 608
  • [7] A Practical Approach For Workload-Aware Data Movement in Disaggregated Memory Systems
    Puri, Amit
    Bellamkonda, Kartheek
    Narreddy, Kailash
    Jose, John
    Venkatesh, Tamarapalli
    [J]. 2023 IEEE 35TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING, SBAC-PAD, 2023, : 78 - 88
  • [8] A Workload-Aware Change Data Capture Framework for Data Warehousing
    Qu, Weiping
    Liu, Xiufeng
    Dessloch, Stefan
    [J]. BIG DATA ANALYTICS AND KNOWLEDGE DISCOVERY (DAWAK 2021), 2021, 12925 : 222 - 231
  • [9] ChewAnalyzer: Workload-Aware Data Management Across Differentiated Storage Pools
    Ge, Xiongzi
    Xie, Xuchao
    Du, David H. C.
    Ganesan, Pradeep
    Hahn, Dennis
    [J]. 2018 IEEE 26TH INTERNATIONAL SYMPOSIUM ON MODELING, ANALYSIS, AND SIMULATION OF COMPUTER AND TELECOMMUNICATION SYSTEMS (MASCOTS), 2018, : 94 - 101
  • [10] Energy-Aware Scheduling Scheme Using Workload-Aware Consolidation Technique in Cloud Data Centres
    Li Hongyou
    Wang Jiangyong
    Peng Jian
    Wang Junfeng
    Liu Tang
    [J]. CHINA COMMUNICATIONS, 2013, 10 (12) : 114 - 124