STARM: STreaming Association Rules Mining in High-Dimensional Data

Cited by: 0
Authors
Gahar, Rania Mkhinini [1]
Arfaoui, Olfa [2]
Hidri, Adel [3]
Alsaif, Suleiman Ali [3]
Hidri, Minyar Sassi [3]
Affiliations
[1] Univ Tunis El Manar, Natl Engn Sch Tunis, OASIS Res Lab, Tunis, Tunisia
[2] Univ Tunis El Manar, Natl Engn Sch Tunis, RISC Res Lab, Tunis, Tunisia
[3] Imam Abdulrahman Bin Faisal Univ, Dept Comp, Deanship Preparatory Year & Supporting Studies, Dammam, Saudi Arabia
Keywords
Association Rules; Dimensionality Reduction; Spark Streaming; Apriori; Sliding Window;
DOI
10.1007/978-3-031-57853-3_12
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Predictive analytics uses data mining algorithms to discover knowledge in large databases. Association Rules (ARs) mining is one of the most prevalent data mining techniques in this context. For Big Data, the relevant setting is data stream mining, the process of extracting knowledge from continuous data streams. This paper proposes STARM (STreaming Association Rules Mining), an efficient and distributed algorithm for mining ARs. Based on the transaction-sensitive sliding-window model, the Apriori algorithm is applied to data streams to extract frequent itemsets (FIs), from which ARs are then generated using the Spark Streaming framework. A Dimensionality Reduction (DR) step serves as data preprocessing and may reduce the search space. The experiments show that the proposed streaming model achieves state-of-the-art performance.
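The pipeline described in the abstract (a transaction-sensitive sliding window, Apriori over the window contents, then rule generation) can be illustrated with a minimal single-machine sketch. This is not the authors' STARM implementation: it uses plain Python rather than Spark Streaming, omits the dimensionality reduction step, and the names `apriori`, `rules`, `SlidingWindowMiner`, `window_size`, `min_support` and `min_confidence` are hypothetical choices made for illustration only.

```python
from collections import deque
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset (textbook Apriori)."""
    n = len(transactions)
    if n == 0:
        return {}
    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c / n for s, c in counts.items() if c / n >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        frequent = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                frequent[c] = support
        result.update(frequent)
        k += 1
    return result


def rules(frequent, min_confidence):
    """Derive (antecedent, consequent, confidence) triples from frequent itemsets."""
    out = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                # Subsets of a frequent itemset are frequent, so the lookup succeeds.
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    out.append((lhs, itemset - lhs, confidence))
    return out


class SlidingWindowMiner:
    """Keeps the last `window_size` transactions and re-mines on each arrival."""

    def __init__(self, window_size, min_support, min_confidence):
        # deque(maxlen=...) evicts the oldest transaction automatically.
        self.window = deque(maxlen=window_size)
        self.min_support = min_support
        self.min_confidence = min_confidence

    def add(self, transaction):
        self.window.append(frozenset(transaction))
        frequent = apriori(list(self.window), self.min_support)
        return rules(frequent, self.min_confidence)
```

In this sketch the window slides by one transaction per arrival and the whole window is re-mined each time, which keeps the code short but is far from efficient; a distributed system would typically process micro-batches and partition the support counting across workers.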
Pages: 136-146
Page count: 11
Related papers
50 items in total
  • [31] On eigenfunction approach to data mining: outlier detection in high-dimensional data sets
    Nagar, AK
    Muyeba, MK
    8TH WORLD MULTI-CONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL II, PROCEEDINGS: COMPUTING TECHNIQUES, 2004, : 251 - 256
  • [32] Streaming Algorithms for High-Dimensional Robust Statistics
    Diakonikolas, Ilias
    Kane, Daniel M.
    Pensia, Ankit
    Pittas, Thanasis
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [33] High-Dimensional Geometric Streaming in Polynomial Space
    Woodruff, David P.
    Yasuda, Taisuke
    2022 IEEE 63RD ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE (FOCS), 2022, : 732 - 743
  • [34] Change-Point Estimation of High-Dimensional Streaming Data via Sketching
    Chi, Yuejie
    Wu, Yihong
    2015 49TH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, 2015, : 102 - 106
  • [35] Efficient locality-sensitive hashing over high-dimensional streaming data
    Wang, Hao
    Yang, Chengcheng
    Zhang, Xiangliang
    Gao, Xin
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (05): : 3753 - 3766
  • [36] Online streaming feature selection for high-dimensional small-sample data
    Kuangfeng Gong
    Guohe Li
    Lingyun Guo
    Yaojin Lin
    International Journal of Machine Learning and Cybernetics, 2025, 16 (4) : 2705 - 2719
  • [37] Efficient locality-sensitive hashing over high-dimensional streaming data
    Hao Wang
    Chengcheng Yang
    Xiangliang Zhang
    Xin Gao
    Neural Computing and Applications, 2023, 35 : 3753 - 3766
  • [38] An Adaptive Sampling Strategy for Online Monitoring and Diagnosis of High-Dimensional Streaming Data
    Gomez, Ana Maria Estrada
    Li, Dan
    Paynabar, Kamran
    TECHNOMETRICS, 2022, 64 (02) : 253 - 269
  • [39] High-dimensional data
    Geubbelmans, Melvin
    Rousseau, Axel-Jan
    Valkenborg, Dirk
    Burzykowski, Tomasz
    AMERICAN JOURNAL OF ORTHODONTICS AND DENTOFACIAL ORTHOPEDICS, 2023, 164 (03) : 453 - 456
  • [40] High-dimensional data
    Amaratunga, Dhammika
    Cabrera, Javier
    JOURNAL OF THE NATIONAL SCIENCE FOUNDATION OF SRI LANKA, 2016, 44 (01): : 3 - 9