IncApprox: A Data Analytics System for Incremental Approximate Computing

Cited by: 32
Authors
Krishnan, Dhanya R. [1]
Do Le Quoc [1]
Bhatotia, Pramod [1]
Fetzer, Christof [1]
Rodrigues, Rodrigo [2, 3]
Affiliations
[1] Tech Univ Dresden, Dresden, Germany
[2] Univ Lisbon, IST, Lisbon, Portugal
[3] INESC ID, Lisbon, Portugal
DOI
10.1145/2872427.2883026
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Incremental and approximate computations are increasingly being adopted for data analytics to achieve low-latency execution and efficient utilization of computing resources. Incremental computation updates the output incrementally instead of re-computing everything from scratch for successive runs of a job with input changes. Approximate computation returns an approximate output for a job instead of the exact output. Both paradigms rely on computing over a subset of data items instead of computing over the entire dataset, but they differ in their means for skipping parts of the computation. Incremental computing relies on the memoization of intermediate results of sub-computations, and reusing these memoized results across jobs. Approximate computing relies on representative sampling of the entire dataset to compute over a subset of data items. In this paper, we observe that these two paradigms are complementary, and can be married together! Our idea is quite simple: design a sampling algorithm that biases the sample selection to the memoized data items from previous runs. To realize this idea, we designed an online stratified sampling algorithm that uses self-adjusting computation to produce an incrementally updated approximate output with bounded error. We implemented our algorithm in a data analytics system called IncApprox based on Apache Spark Streaming. Our evaluation using micro-benchmarks and real-world case studies shows that IncApprox achieves the benefits of both incremental and approximate computing.
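The core idea in the abstract, biasing a stratified sample toward items whose results were memoized in earlier runs, can be illustrated with a small Python sketch. This is not the paper's algorithm: it omits the online (streaming) setting, self-adjusting computation, and the error bounds, and a real design would have to account for how the bias affects the estimate. The function name `biased_stratified_sample`, the proportional allocation rule, and the `(stratum, key)` item layout are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def biased_stratified_sample(items, memoized_keys, sample_size, rng):
    """Pick a per-stratum sample, preferring keys that are already memoized.

    items:         list of (stratum, key) pairs (hypothetical layout)
    memoized_keys: set of keys whose results exist from a previous run
    sample_size:   target total number of sampled items
    rng:           a random.Random instance
    """
    # Group keys by stratum.
    strata = defaultdict(list)
    for stratum, key in items:
        strata[stratum].append(key)

    total = len(items)
    sample = {}
    for stratum, keys in strata.items():
        # Assumed allocation: proportional to stratum size, at least one item.
        quota = min(len(keys), max(1, round(sample_size * len(keys) / total)))

        reused = [k for k in keys if k in memoized_keys]
        fresh = [k for k in keys if k not in memoized_keys]

        # Fill the quota from memoized keys first, then top up with fresh ones.
        take_reused = rng.sample(reused, min(quota, len(reused)))
        take_fresh = rng.sample(fresh, quota - len(take_reused))
        sample[stratum] = take_reused + take_fresh
    return sample


# Usage: two strata of different sizes, a few keys memoized from a prior run.
items = [("a", i) for i in range(10)] + [("b", i) for i in range(10, 40)]
picked = biased_stratified_sample(items, {0, 1, 2, 10, 11}, 8, random.Random(0))
```

In this sketch, a stratum's sample is drawn entirely from memoized keys when enough of them exist, so their cached results can be reused and only the fresh keys need new computation.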
Pages: 1133-1144
Page count: 12