SIESTA: A Scalable Infrastructure of Sequential Pattern Analysis

Cited by: 3
Authors
Mavroudopoulos, Ioannis [1]
Gounaris, Anastasios [1]
Affiliations
[1] Aristotle Univ Thessaloniki, Dept Informat, Thessaloniki 54124, Greece
Keywords
Big Data; Task analysis; Pattern analysis; Scalability; Databases; Indexing; Time factors; Big data; pattern detection; scalable infrastructure; sequential pattern analysis; EFFICIENT; DISCOVERY; PARALLEL;
DOI
10.1109/TBDATA.2022.3229092
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Sequential pattern analysis has become a mature topic with a lot of techniques for a variety of sequential pattern mining-related problems. Moreover, tailored solutions for specific domains, such as business process mining, have been developed. However, there is a gap in the literature for advanced techniques for efficient detection of arbitrary sequences in large collections of activity logs. In this work, we introduce the SIESTA (Scalable Infrastructure of Sequential Pattern Analysis) solution, making a threefold contribution: (i) we employ a novel architecture that relies on inverted indices during preprocessing and we introduce an advanced query processor that can detect and explore arbitrary patterns efficiently; (ii) we discuss and evaluate different configurations to optimize both the preprocessing and the querying phase; and (iii) we present evaluation results competing against representatives of the state-of-the-art with a focus on Big Data. The experimental results are particularly encouraging, e.g., when all methods are deployed in a cluster and the volume of the data is increased, SIESTA creates the indices in almost half the time compared to the state-of-the-art Elasticsearch-based solution, while also yielding faster query responses than all its competitors by up to one order of magnitude.
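To make the idea of index-based pattern detection in the abstract concrete, here is a minimal Python sketch. It is not the SIESTA implementation: the abstract only states that inverted indices are built during preprocessing, so the choice to index ordered activity pairs, the in-memory dictionaries, the gap-tolerant matching, and all names (`build_pair_index`, `detect`, the toy `logs`) are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_pair_index(logs):
    """Preprocessing (assumed design): map each ordered activity pair (a, b)
    to the set of trace ids in which some `a` occurs before some `b`."""
    index = defaultdict(set)
    for trace_id, events in logs.items():
        for i, j in combinations(range(len(events)), 2):
            index[(events[i], events[j])].add(trace_id)
    return index


def contains_subsequence(events, pattern):
    """True if `pattern` occurs in `events` in order, gaps allowed."""
    it = iter(events)
    return all(step in it for step in pattern)


def detect(logs, index, pattern):
    """Query: prune with the index, then verify the surviving traces."""
    if len(pattern) < 2:
        return sorted(t for t, ev in logs.items()
                      if contains_subsequence(ev, pattern))
    candidates = None
    for pair in zip(pattern, pattern[1:]):
        traces = index.get(pair, set())
        candidates = traces if candidates is None else candidates & traces
        if not candidates:
            return []
    # The pair index can over-approximate, so each candidate is checked.
    return sorted(t for t in candidates
                  if contains_subsequence(logs[t], pattern))


# Hypothetical toy activity log: trace id -> ordered list of activities.
logs = {
    "t1": ["A", "B", "C", "D"],
    "t2": ["A", "C", "B"],
    "t3": ["B", "A", "C"],
    "t4": ["B", "C", "A", "B"],
}
index = build_pair_index(logs)
print(detect(logs, index, ["A", "B", "C"]))
print(detect(logs, index, ["A", "C"]))
```

In this sketch, trace `t4` survives the index lookup for both `(A, B)` and `(B, C)` but fails verification, which shows why a final check is still needed after pruning with the index.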
Pages: 975-990
Number of pages: 16