A Scalable Complex Event Analytical System with Incremental Episode Mining over Data Streams

Authors
Tseng, Jerry C. C. [1 ]
Gu, Jia-Yuan [1 ]
Tseng, Vincent S. [2 ]
Wang, P. F. [3 ]
Chen, Ching-Yu [3 ]
Li, Chu-Feng [3 ]
Affiliations
[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan, Taiwan
[2] Natl Chiao Tung Univ, Dept Comp Sci, Hsinchu, Taiwan
[3] Inst Informat Ind, Taipei, Taiwan
Keywords
Data Stream; Incremental Mining; Episode Pattern Mining; Lambda Architecture; Frequent Episodes
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Episode pattern mining is a powerful technique for extracting high-value information that helps solve real-life cross-disciplinary problems, such as the analysis of manufacturing data, stock markets, and weather records. As data grows, the mining process must be re-triggered repeatedly to obtain up-to-date information. Periodically re-mining the full dataset is not cost-effective, however, and a number of incremental mining approaches have therefore been proposed for growing data. To the best of our knowledge, few studies have targeted the problem of incremental episode mining. Moreover, streaming data of complex events is increasingly common, because digital sensors continuously collect data around us in the big data age. The challenge is now not only to mine valuable episode patterns from incremental datasets, but also to mine episode patterns over data streams of complex events. To address this research problem, we adopt the Lambda Architecture to design a scalable complex event analytical system that facilitates incremental episode mining over complex event sequences in data streams. Apache Spark and Apache Spark Streaming serve as the development frameworks for the batch layer and the speed layer, respectively. To account for both efficiency and accuracy, we develop a series of modules and three algorithms: batch episode mining, delta episode mining, and pattern merging. Experimental validation on a real dataset shows that the proposed system is highly scalable and delivers excellent performance in terms of efficiency and accuracy.
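The abstract names three algorithms (batch episode mining, delta episode mining, and pattern merging) without giving their details. The following is only a minimal sketch of the general idea of merging batch and delta results, not the authors' method. It assumes, for illustration, that an episode is an ordered pair of event types, that support is the number of event sequences containing the pair in order, and that the batch and delta inputs are disjoint sets of sequences. The function names `count_episodes`, `merge_patterns`, and `frequent` are hypothetical, and plain Python stands in for Apache Spark and Spark Streaming.

```python
from collections import Counter
from itertools import combinations


def count_episodes(sequences):
    """Count, per ordered pair (a, b), how many sequences contain a before b.

    Assumption: an "episode" here is just an ordered pair of event types;
    real episode mining handles longer patterns and time windows.
    """
    counts = Counter()
    for seq in sequences:
        # Collect each pair at most once per sequence.
        seen = {(seq[i], seq[j]) for i, j in combinations(range(len(seq)), 2)}
        counts.update(seen)
    return counts


def merge_patterns(batch_counts, delta_counts):
    """Merge support counts from the batch result and the delta result.

    This is exact only because batch and delta are assumed to be disjoint
    sets of sequences and all counts (not just frequent ones) are kept.
    """
    return batch_counts + delta_counts


def frequent(counts, total_sequences, min_support):
    """Keep episodes whose relative support reaches min_support."""
    return {ep: c for ep, c in counts.items()
            if c / total_sequences >= min_support}


# Hypothetical data: the batch layer has seen two sequences,
# and two more arrive later as the delta.
batch = [["A", "B", "C"], ["A", "C"]]
delta = [["B", "C"], ["A", "B"]]

merged = merge_patterns(count_episodes(batch), count_episodes(delta))
patterns = frequent(merged, len(batch) + len(delta), min_support=0.5)
```

Under these assumptions, merging the two count tables gives the same result as re-mining the combined data, which is the cost that incremental mining aims to avoid.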
Pages: 648 - 655
Number of pages: 8