A comparative study of dimensionality reduction techniques to enhance trace clustering performances

Cited by: 52
Authors
Song, M. [1]
Yang, H. [1]
Siadat, S. H. [1]
Pechenizkiy, M. [2]
Affiliations
[1] Ulsan Natl Inst Sci & Technol, Sch Technol Management, Ulsan 689798, South Korea
[2] Eindhoven Univ Technol, Dept Comp Sci, NL-5612 AZ Eindhoven, Netherlands
Funding
National Research Foundation, Singapore;
Keywords
Process mining; Trace clustering; Singular value decomposition; Random projection; PCA; SINGULAR VALUE DECOMPOSITION; RANDOM PROJECTIONS; PROCESS MODELS; CHECKING; SUPPORT;
DOI
10.1016/j.eswa.2012.12.078
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Process mining techniques have been used to analyze event logs from information systems in order to derive useful patterns. However, in the big data era, real-life event logs are huge, unstructured, and complex, so traditional process mining techniques have difficulty analyzing them. To reduce the complexity of the analysis, trace clustering can be used to group similar traces together and to mine simpler, more structured process models for each cluster locally. However, the high dimensionality of the feature space in which the traces are presented poses problems for trace clustering. In this paper, we study the effect of applying dimensionality reduction (preprocessing) techniques on the performance of trace clustering. In our experimental study we use three popular feature transformation techniques: singular value decomposition (SVD), random projection (RP), and principal components analysis (PCA), together with state-of-the-art trace clustering in process mining. The experimental results on a dataset constructed from a real event log of patient treatment processes in a Dutch hospital show that dimensionality reduction can improve trace clustering performance with respect to computation time and the average fitness of the mined local process models. (C) 2012 Elsevier Ltd. All rights reserved.
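As a rough illustration of the preprocessing step the abstract describes, the sketch below builds an activity-count profile for a few traces and reduces it with a Gaussian random projection, one of the three techniques named. It is a minimal sketch under stated assumptions, not the authors' implementation: the toy traces, the activity-count feature profile, and the helper names `activity_profile` and `random_projection` are all hypothetical and do not come from the paper.

```python
import math
import random


def activity_profile(traces):
    """Map each trace (a list of activity names) to a vector of activity counts."""
    activities = sorted({a for t in traces for a in t})
    index = {a: i for i, a in enumerate(activities)}
    vectors = []
    for t in traces:
        v = [0.0] * len(activities)
        for a in t:
            v[index[a]] += 1.0
        vectors.append(v)
    return activities, vectors


def random_projection(vectors, k, seed=0):
    """Project d-dimensional vectors onto k dimensions with a Gaussian random matrix."""
    rng = random.Random(seed)
    d = len(vectors[0])
    scale = 1.0 / math.sqrt(k)  # keeps expected squared norms comparable
    R = [[rng.gauss(0.0, 1.0) * scale for _ in range(k)] for _ in range(d)]
    return [[sum(v[i] * R[i][j] for i in range(d)) for j in range(k)] for v in vectors]


# Hypothetical toy event log: each trace is one patient's sequence of activities.
traces = [
    ["register", "triage", "blood_test", "discharge"],
    ["register", "triage", "blood_test", "blood_test", "discharge"],
    ["register", "x_ray", "surgery", "recovery", "discharge"],
    ["register", "x_ray", "surgery", "recovery", "recovery", "discharge"],
]

activities, X = activity_profile(traces)
Z = random_projection(X, k=3)
print(len(activities), "features reduced to", len(Z[0]))
```

The reduced vectors in `Z` would then be passed to a clustering step in place of the original profiles. SVD or PCA could be substituted for the projection, with the target dimension `k` treated as a tuning parameter.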
Pages: 3722-3737
Page count: 16