A Scalable Complex Event Analytical System with Incremental Episode Mining over Data Streams

Authors
Tseng, Jerry C. C. [1]
Gu, Jia-Yuan [1]
Tseng, Vincent S. [2]
Wang, P. F. [3]
Chen, Ching-Yu [3]
Li, Chu-Feng [3]
Affiliations
[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan, Taiwan
[2] Natl Chiao Tung Univ, Dept Comp Sci, Hsinchu, Taiwan
[3] Inst Informat Ind, Taipei, Taiwan
Keywords
Data Stream; Incremental Mining; Episode Pattern Mining; Lambda Architecture; Frequent Episodes
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Episode pattern mining is a powerful technique for extracting high-value information that helps solve real-life, cross-disciplinary problems, such as the analysis of manufacturing data, stock markets, and weather records. As data grows, the mining process must be re-triggered repeatedly to keep the results up to date. Periodically re-mining the full dataset is not cost-effective, so a number of incremental mining approaches have been proposed for growing data. To the best of our knowledge, however, few studies have targeted the problem of incremental episode mining. Moreover, streaming data of complex events is increasingly common, because digital sensors continuously collect data around us in the big data age. The challenge is therefore not only to mine valuable episode patterns from an incrementally growing dataset, but also to mine episode patterns over data streams of complex events. To address this research problem, we adopt the Lambda Architecture to design a scalable complex event analytical system that facilitates incremental episode mining over complex event sequences of data streams. Apache Spark and Apache Spark Streaming serve as the development frameworks of the batch layer and the speed layer, respectively. To take both efficiency and accuracy into consideration, we develop a series of modules and three algorithms: batch episode mining, delta episode mining, and pattern merging. Experimental validation on a real dataset shows that the proposed system is highly scalable and delivers excellent performance in terms of efficiency and accuracy.
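The batch / delta / merge division named in the abstract can be illustrated with a toy sketch. This is not the paper's algorithms, which the abstract does not specify: it assumes a deliberately simple episode definition (a serial pair a -> b where b follows a within `max_span` time units), uses plain Python rather than Apache Spark, and every function and variable name below is hypothetical.

```python
from collections import Counter


def count_pairs(events, max_span, context=()):
    """Count serial 2-episodes (a -> b) with 0 < t_b - t_a <= max_span.

    `events` is a time-ordered list of (timestamp, event_type). Only pairs
    whose second event lies in `events` are counted; `context` supplies
    earlier events that may act as the first element of a pair.
    """
    counts = Counter()
    history = list(context)
    for t_b, b in events:
        # Drop events that are too old to pair with anything from now on.
        history = [(t_a, a) for t_a, a in history if t_b - t_a <= max_span]
        for t_a, a in history:
            if t_b - t_a > 0:
                counts[(a, b)] += 1
        history.append((t_b, b))
    return counts


def batch_episode_mining(historic, max_span):
    """Batch layer (assumed role): mine the full historic sequence."""
    return count_pairs(historic, max_span)


def delta_episode_mining(new_events, max_span, historic):
    """Speed layer (assumed role): mine only the newly arrived events,
    using the tail of the historic data as context across the boundary."""
    start = new_events[0][0]
    tail = [(t, e) for t, e in historic if start - t <= max_span]
    return count_pairs(new_events, max_span, context=tail)


def pattern_merging(batch_counts, delta_counts, min_support):
    """Merge both views and keep episodes that reach min_support."""
    merged = batch_counts + delta_counts
    return {ep: c for ep, c in merged.items() if c >= min_support}


historic = [(1, "A"), (2, "B"), (4, "A"), (5, "C")]
new = [(6, "B"), (7, "C")]

batch = batch_episode_mining(historic, max_span=2)
delta = delta_episode_mining(new, max_span=2, historic=historic)
merged = pattern_merging(batch, delta, min_support=2)
print(merged)
```

Under this simplified definition, the merged counts equal what a full re-mining of `historic + new` would produce, because the delta step sees the boundary tail of the old data. A real episode definition (for example, one based on minimal occurrences) may not decompose this cleanly, which is presumably why the abstract treats accuracy as a separate concern.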
Pages: 648-655
Number of pages: 8