Self-Adaptive Framework for Efficient Stream Data Classification on Storm

Cited by: 10
Authors
Deng, Shizhuo [1]
Wang, Botao [1]
Huang, Shan [1]
Yue, Chuncheng [1]
Zhou, Jianpeng [1]
Wang, Guoren [1]
Affiliations
[1] Northeastern Univ, Sch Engn & Comp Sci, Shenyang 110004, Peoples R China
Keywords
Storms; Training; Learning systems; Proposals; Throughput; Artificial neural networks; Real-time systems; Classification; extreme learning machine (ELM); partition strategy; Storm; stream data; EXTREME LEARNING-MACHINE; SYSTEM; IMPLEMENTATION;
DOI
10.1109/TSMC.2017.2757029
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In the era of big data, stream data classification, a typical data stream application, has become increasingly important and challenging. In such applications, classification occurs far more frequently than model training. The rate at which stream data arrives for classification is high and time-varying, so classifying the stream data efficiently with high throughput is an important problem. In this paper, we first analyze and categorize current data stream machine learning algorithms according to their data structures. We then propose a stream data classification topology (SDC-Topology) on Storm. For matrix-based classification algorithms, we propose a self-adaptive stream data classification framework (SASDC-Framework) for efficient stream data classification on Storm. In SASDC-Framework, all data arriving in the same unit of time are partitioned into subsets of nearly optimal size and processed in parallel. To select the nearly optimal partition size efficiently, we adopt a bisection method strategy and an inverse distance weighted strategy. Extreme learning machine (ELM), a fast and accurate machine learning method based on matrix computation, is used to test the efficiency of our proposals. According to the evaluation results, the throughput of SASDC-Framework is 8-35 times higher than that of SDC-Topology, and the best throughput exceeds 40000 prediction requests per second in our environment.
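The abstract names two generic building blocks, partitioning a unit-time batch into fixed-size subsets and inverse distance weighting, without giving details. The following is a minimal sketch of both, not the authors' implementation: the function names, the `power` parameter, and the idea of interpolating a partition size from previously observed (batch size, good partition size) pairs are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def partition(batch: Sequence, size: int) -> List[Sequence]:
    """Split one unit-time batch into consecutive subsets of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [batch[i:i + size] for i in range(0, len(batch), size)]


def idw_estimate(samples: Sequence[Tuple[float, float]], x: float,
                 power: float = 2.0) -> float:
    """Generic inverse distance weighted estimate at x from (x_i, y_i) samples.

    Hypothetical use: x_i is a previously seen batch size and y_i the
    partition size that worked well for it.
    """
    num = 0.0
    den = 0.0
    for xi, yi in samples:
        d = abs(x - xi)
        if d == 0:
            return yi  # exact match with a known sample
        w = 1.0 / d ** power  # closer samples get larger weights
        num += w * yi
        den += w
    return num / den
```

For example, `partition(list(range(10)), 4)` yields three subsets of sizes 4, 4, and 2, which could then be dispatched to parallel workers; `idw_estimate` returns a value between the known samples, weighted toward the nearest one.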
Pages: 123-136
Number of pages: 14
Related papers
  • [1] ESA-Stream: Efficient Self-Adaptive Online Data Stream Clustering
    Li, Yanni
    Li, Hui
    Wang, Zhi
    Liu, Bing
    Cui, Jiangtao
    Fei, Hang
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (02) : 617 - 630
  • [2] Research on Self-Adaptive Stream Data Mining
    Xiao, Fang
    [J]. 2016 INTERNATIONAL CONGRESS ON COMPUTATION ALGORITHMS IN ENGINEERING (ICCAE 2016), 2016, : 1 - 7
  • [3] ESA-Stream: Efficient Self-Adaptive Online Data Stream Clustering (Extended Abstract)
    Li, Yanni
    Li, Hui
    Wang, Zhi
    Liu, Bing
    Cui, Jiangtao
    Fei, Hang
    [J]. 2021 IEEE 37TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2021), 2021, : 2329 - +
  • [4] MPR: An MPI Framework for Distributed Self-adaptive Stream Processing
    Loff, Junior
    Griebler, Dalvan
    Fernandes, Luiz Gustavo
    Binder, Walter
    [J]. EURO-PAR 2024: PARALLEL PROCESSING, PT III, EURO-PAR 2024, 2024, 14803 : 400 - 414
  • [5] A Self-Adaptive Antialiasing Framework for Seismic Data Interpolation
    Wang, Yuqing
    Lu, Wenkai
    Li, Yinshuo
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [6] An on-line interactive self-adaptive image classification framework
    Sannen, Davy
    Nuttin, Marnix
    Smith, Jim
    Tahir, Muhammad Atif
    Caleb-Solly, Praminda
    Lughofer, Edwin
    Eitzinger, Christian
    [J]. COMPUTER VISION SYSTEMS, PROCEEDINGS, 2008, 5008 : 171 - 180
  • [7] A SPARSE GREEDY SELF-ADAPTIVE ALGORITHM FOR CLASSIFICATION OF DATA
    Srivastava, Ankur
    Meade, Andrew J.
    [J]. ADVANCES IN DATA SCIENCE AND ADAPTIVE ANALYSIS, 2010, 2 (01) : 97 - 114
  • [8] Self-Adaptive Anytime Stream Clustering
    Kranen, Philipp
    Assent, Ira
    Baldauf, Corinna
    Seidl, Thomas
    [J]. 2009 9TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2009, : 249 - +
  • [9] A Resource-Efficient Monitoring Architecture for Hardware Accelerated Self-Adaptive Online Data Stream Compression
    Najmabadi, Seyyed Mandi
    Pandit, Prajwala
    Trung-Hieu Tran
    Simon, Sven
    [J]. 2017 SIGNAL PROCESSING: ALGORITHMS, ARCHITECTURES, ARRANGEMENTS, AND APPLICATIONS (SPA 2017), 2017, : 222 - 227
  • [10] OPOSSAM: Online Prediction of Stream Data Using Self-adaptive Memory
    Yamaguchi, Akihiro
    Maya, Shigeru
    Inagi, Tatsuya
    Ueno, Ken
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018, : 2355 - 2364