Analysis Layer Implementation Method for a Streaming Data Processing System

Cited by: 0
Authors
Burdakov, Aleksey [1]
Grigorev, Uriy [1]
Ploutenko, Andrey [2]
Ermakov, Oleg [1]
Affiliations
[1] Bauman Moscow State Tech Univ, Moscow, Russia
[2] Amur State Univ, Blagoveshchensk, Russia
Keywords
Streaming Processing; Analysis Layer; Sketch; Count-Min Sketch Algorithm
DOI
10.5220/0010465902620269
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Analysis is an important part of widely used streaming data processing. During analysis, the occurrence frequency of stream elements and the sum of their values are calculated. Algorithms such as Count-Min Sketch produce a large error when restoring the aggregate if the number of elements is large. This article proposes the use of a vector matrix in which each vector has length 'n'. If the number of distinct elements approaches 'n', the window size is automatically reduced, which allows the aggregate to be stored accurately without losing elements. A SELECT operator for searching in the vector array is also proposed; it returns various slices of the aggregated data accumulated over the window. The developed method was compared with the Count-Min Sketch data processing method in the Analysis Layer. The experiment showed that the vector-matrix method reduces memory consumption by more than a factor of two and ensures exact execution of the SELECT statement. Introducing a floating window maintains calculation accuracy and avoids losing records from the stream. By contrast, the error of the same query executed on a sketch reaches 200%.
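The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the hashing scheme, the rule of closing the window once 'n' distinct keys are stored, and the `select` helper are all assumptions made for illustration. It shows a standard Count-Min Sketch, whose estimates can only overshoot when many distinct elements share counters, next to an exact per-key aggregate that shortens its window instead of dropping elements.

```python
import hashlib


class CountMinSketch:
    """Standard Count-Min Sketch: depth rows of width counters each."""

    def __init__(self, width, depth):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One hash per row, derived by salting the key with the row number.
        digest = hashlib.blake2b(f"{row}:{key}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, key, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, key)] += count

    def estimate(self, key):
        # Never below the true count; collisions inflate it.
        return min(self.table[row][self._index(row, key)]
                   for row in range(self.depth))


class BoundedExactWindow:
    """Exact (count, sum) per key, holding at most n distinct keys.

    Assumption: when a new key arrives and n keys are already stored,
    the current window is closed early and a new one is started, so no
    element is lost and no aggregate is approximated.
    """

    def __init__(self, n):
        self.n = n
        self.agg = {}  # key -> [count, sum of values]

    def add(self, key, value):
        """Add one element; return the closed window's aggregate, or None."""
        closed = None
        if key not in self.agg and len(self.agg) >= self.n:
            closed = self.agg
            self.agg = {}
        entry = self.agg.setdefault(key, [0, 0])
        entry[0] += 1
        entry[1] += value
        return closed

    def select(self, predicate):
        """Hypothetical SELECT-like slice: exact aggregates for matching keys."""
        return {k: tuple(v) for k, v in self.agg.items() if predicate(k)}
```

With 1000 distinct keys added once each to a `CountMinSketch(width=50, depth=3)`, most estimates must exceed the true count of 1, because each row has far fewer counters than keys. The bounded window returns exact values at the cost of a window length that varies with the number of distinct keys.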
Pages: 262-269
Number of pages: 8