Compliant Geo-distributed Data Processing in Action

Cited by: 2
Authors
Beedkar, Kaustubh [1]
Brekardin, David [1]
Quiane-Ruiz, Jorge-Arnulfo [1, 2]
Markl, Volker [1, 2]
Affiliations
[1] TU Berlin, Berlin, Germany
[2] DFKI, Kaiserslautern, Germany
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2021, Vol. 14, No. 12
DOI
10.14778/3476311.3476359
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In this paper, we present our work on compliant geo-distributed data processing. Our work focuses on the new dimension of dataflow constraints that regulate the movement of data across geographical or institutional borders. For example, European directives may permit transferring only certain information fields (such as non-personal information) or only aggregated data. Thus, it is crucial for distributed data processing frameworks to consider compliance with the dataflow constraints derived from these regulations. We have developed a compliance-based data processing framework, which (i) allows for the declarative specification of dataflow constraints, (ii) determines whether a query can be translated into a compliant distributed query execution plan, and (iii) executes the compliant plan over distributed SQL databases. We demonstrate our framework using a geo-distributed adaptation of the TPC-H benchmark data. Our framework provides an interactive dashboard, which allows users to specify dataflow constraints and to analyze and execute compliant distributed query execution plans.
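The idea of checking a distributed plan against dataflow constraints can be illustrated with a minimal sketch. This is not the authors' framework or its constraint language: the `Constraint` and `Transfer` classes, the site names `EU` and `US`, and the rule that a table with no matching constraint may be shipped freely are all assumptions made for illustration. Only the column names follow the TPC-H schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    """Hypothetical rule: what may leave `origin` for a given table."""
    table: str
    origin: str
    allowed_columns: frozenset
    aggregated_only: bool = False  # if True, only aggregated data may leave


@dataclass(frozen=True)
class Transfer:
    """Hypothetical data movement performed by one step of a query plan."""
    table: str
    origin: str
    destination: str
    columns: frozenset
    aggregated: bool


def is_compliant(transfer, constraints):
    """Check a single transfer against every constraint that applies to it."""
    if transfer.origin == transfer.destination:
        return True  # data never crosses a border
    for c in constraints:
        if c.table == transfer.table and c.origin == transfer.origin:
            if not transfer.columns <= c.allowed_columns:
                return False  # ships a column that must stay at the origin
            if c.aggregated_only and not transfer.aggregated:
                return False  # raw rows where only aggregates are allowed
    # Assumption: tables without a matching constraint are unrestricted.
    return True


def plan_is_compliant(plan, constraints):
    """A plan is compliant only if every one of its transfers is."""
    return all(is_compliant(t, constraints) for t in plan)


constraints = [
    Constraint("customer", "EU",
               frozenset({"c_custkey", "c_nationkey", "c_mktsegment"})),
    Constraint("orders", "EU", frozenset({"o_totalprice"}),
               aggregated_only=True),
]

ok_plan = [
    Transfer("customer", "EU", "US",
             frozenset({"c_custkey", "c_nationkey"}), aggregated=False),
    Transfer("orders", "EU", "US",
             frozenset({"o_totalprice"}), aggregated=True),
]

bad_plan = [
    # c_name is personal information and is not in the allowed set.
    Transfer("customer", "EU", "US",
             frozenset({"c_custkey", "c_name"}), aggregated=False),
]

print(plan_is_compliant(ok_plan, constraints))
print(plan_is_compliant(bad_plan, constraints))
```

In this sketch the constraints are plain data rather than code, which loosely mirrors the declarative specification the abstract mentions. A real system would presumably search for a compliant plan rather than only validate a given one.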
Pages: 2843-2846
Number of pages: 4