Temporal Workload-Aware Replicated Partitioning for Social Networks

Cited by: 18
Authors
Turk, Ata [1]
Selvitopi, R. Oguz [2]
Ferhatosmanoglu, Hakan [2]
Aykanat, Cevdet [2]
Affiliations
[1] Yahoo Labs, Barcelona, Spain
[2] Bilkent Univ, Bilkent, Turkey
Keywords
Cassandra; social network partitioning; selective replication; replicated hypergraph partitioning; Twitter; NoSQL
DOI
10.1109/TKDE.2014.2302291
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The most frequent and expensive queries in social networks involve multi-user operations such as requesting the latest tweets or news feeds of friends. The performance of such queries is heavily dependent on the data partitioning and replication methodologies adopted by the underlying systems. Existing solutions for data distribution in these systems involve hash- or graph-based approaches that ignore the multi-way relations among data. In this work, we propose a novel data partitioning and selective replication method that utilizes the temporal information in prior workloads to predict future query patterns. Our method utilizes the social network structure and the temporality of the interactions among its users to construct a hypergraph that correctly models multi-user operations. It then performs simultaneous partitioning and replication of this hypergraph to reduce the query span while respecting load balance and I/O load constraints under replication. To test our model, we enhance the Cassandra NoSQL system to support selective replication, and we implement a social network application (a Twitter clone) utilizing our enhanced Cassandra. We conduct experiments in a cloud computing environment (Amazon EC2) to test the developed systems. Comparison of the proposed method with hash- and enhanced graph-based schemes indicates that it significantly improves latency and throughput.
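The two ideas the abstract relies on, a hypergraph built from a temporal query log and the query span under a replicated placement, can be illustrated with a small Python sketch. This is not the authors' implementation: the exponential-decay weighting, the `half_life` parameter, the greedy span estimate, and all function names are assumptions made here for illustration only.

```python
from collections import defaultdict


def build_hypergraph(query_log, now, half_life=7.0):
    """Turn a query log into weighted hyperedges.

    Each log entry is (timestamp, users_read_by_the_query). The set of users
    becomes one hyperedge. Recent queries count more via exponential decay
    (an assumed weighting scheme, not taken from the paper).
    """
    weights = defaultdict(float)
    for timestamp, users in query_log:
        age = now - timestamp
        weights[frozenset(users)] += 0.5 ** (age / half_life)
    return dict(weights)


def query_span(users, placement):
    """Greedy estimate of how many servers a query must contact.

    placement maps each user to the set of servers holding a copy of that
    user's data (at least one server per user is assumed).
    """
    uncovered = set(users)
    span = 0
    while uncovered:
        counts = defaultdict(int)
        for user in uncovered:
            for server in placement[user]:
                counts[server] += 1
        # Pick the server covering the most uncovered users; ties go to the
        # smallest server id so the result is deterministic.
        best = max(sorted(counts), key=counts.get)
        uncovered = {u for u in uncovered if best not in placement[u]}
        span += 1
    return span


def weighted_span(hypergraph, placement):
    """Total query span, weighted by how often and how recently each query ran."""
    return sum(w * query_span(users, placement) for users, w in hypergraph.items())


# Toy workload: one frequent recent query and one older query.
log = [(10, ["a", "b", "c"]), (10, ["a", "b", "c"]), (3, ["c", "d"])]
hypergraph = build_hypergraph(log, now=10)

# Hash-like placement scatters users; the second placement co-locates
# users that are read together and keeps a replica of "c" on server 1.
scattered = {"a": {0}, "b": {1}, "c": {2}, "d": {0}}
replicated = {"a": {0}, "b": {0}, "c": {0, 1}, "d": {1}}

span_scattered = weighted_span(hypergraph, scattered)    # 7.0
span_replicated = weighted_span(hypergraph, replicated)  # 2.5
```

In this toy case a single replica of user "c" lets both queries be answered from one server each, which is the kind of span reduction the abstract describes. The sketch ignores the load balance and I/O constraints that the proposed method also respects.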
Pages: 2832-2845
Number of pages: 14