Efficient Transmission and Reconstruction of Dependent Data Streams via Edge Sampling

Cited by: 2
Authors
Wolfrath, Joel [1]
Chandra, Abhishek [1]
Affiliations
[1] Univ Minnesota, Dept Comp Sci & Engn, Minneapolis, MN 55455 USA
Keywords
Stream processing; edge computing; big data; approximate computing
DOI
10.1109/IC2E55432.2022.00013
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Data stream processing is an increasingly important topic due to the prevalence of smart devices and the demand for real-time analytics. Geo-distributed streaming systems, where cloud-based queries utilize data streams from multiple distributed devices, face challenges since wide-area network (WAN) bandwidth is often scarce or expensive. Edge computing allows us to address these bandwidth costs by utilizing resources close to the devices, e.g. to perform sampling over the incoming data streams, which trades downstream query accuracy to reduce the overall transmission cost. In this paper, we leverage the fact that correlations between data streams may exist across devices located in the same geographical region. Using this insight, we develop a hybrid edge-cloud system which systematically trades off between sampling at the edge and estimation of missing values in the cloud to reduce traffic over the WAN. We present an optimization framework which computes sample sizes at the edge and systematically bounds the number of samples we can estimate in the cloud given the strength of the correlation between streams. Our evaluation with three real-world datasets shows that compared to existing sampling techniques, our system could provide comparable error rates over multiple aggregate queries while reducing WAN traffic by 27-42%.
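The trade-off described in the abstract, sending fewer samples from one stream and estimating the rest in the cloud from a correlated stream, can be illustrated with a minimal Python sketch. This is not the paper's optimization framework: the linear-regression estimator, the synthetic streams, the fixed sample sizes, and all function names (`edge_sample`, `fit_line`, `cloud_reconstruct`, `demo`) are assumptions made for illustration, as is the premise that both devices share timestamps so their samples can be aligned by index.

```python
import random


def edge_sample(stream, indices):
    """Edge side: forward only the values at the chosen indices."""
    return {i: stream[i] for i in indices}


def fit_line(xs, ys):
    """Ordinary least squares fit of y ~ a + b * x (assumed estimator)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b


def cloud_reconstruct(sample_a, sample_b):
    """Cloud side: estimate stream B at indices where only A was sent."""
    shared = [i for i in sample_b if i in sample_a]
    a, b = fit_line([sample_a[i] for i in shared],
                    [sample_b[i] for i in shared])
    estimated = dict(sample_b)  # keep the values that were really sent
    for i, x in sample_a.items():
        if i not in estimated:
            estimated[i] = a + b * x  # fill in from the correlated stream
    return estimated


def demo(seed=0):
    rng = random.Random(seed)
    n = 1000
    # Synthetic, strongly correlated streams (illustrative data only).
    stream_a = [rng.gauss(10.0, 2.0) for _ in range(n)]
    stream_b = [2.0 * x + 1.0 + rng.gauss(0.0, 0.5) for x in stream_a]

    idx_a = rng.sample(range(n), 200)  # device A sends 200 samples
    idx_b = idx_a[:50]                 # device B sends only 50 of those
    sample_a = edge_sample(stream_a, idx_a)
    sample_b = edge_sample(stream_b, idx_b)

    est_b = cloud_reconstruct(sample_a, sample_b)
    true_mean = sum(stream_b) / n
    est_mean = sum(est_b.values()) / len(est_b)
    return true_mean, est_mean, len(sample_b), len(est_b)


if __name__ == "__main__":
    print(demo())
```

In this sketch device B transmits a quarter of what device A does, yet the cloud still works with 200 values for B when answering an aggregate query such as the mean. How much can safely be estimated rather than sent depends on the strength of the correlation, which is the quantity the abstract says the authors' framework bounds.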
Pages: 47-57
Page count: 11