A Contextual Approach to Detecting Synonymous and Polluted Activity Labels in Process Event Logs

Cited by: 12
Authors
Sadeghianasl, Sareh [1]
ter Hofstede, Arthur H. M. [1]
Wynn, Moe T. [1]
Suriadi, Suriadi [1]
Affiliations
[1] Queensland Univ Technol, Brisbane, Qld, Australia
Keywords
Data quality; Process event log; Activity label
DOI
10.1007/978-3-030-33246-4_5
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Process mining, as a well-established research area, uses algorithms for process-oriented data analysis. Similar to other types of data analysis, the existence of quality issues in input data will lead to unreliable analysis results (garbage in, garbage out). An important input for process mining is an event log, which is a record of events related to a business process as it is performed through the use of an information system. While addressing quality issues in event logs is necessary, it is usually an ad-hoc and tiresome task. In this paper, we propose an automatic approach for detecting two types of data quality issues related to activities, both critical for the success of process mining studies: synonymous labels (same semantics with different syntax) and polluted labels (same semantics and same label structures). We propose the use of activity context, i.e., control flow, resource, time, and data attributes, to detect semantically identical activity labels. We have implemented our approach and validated it using real-life logs from two hospitals and an insurance company, and have achieved promising results in detecting frequent imperfect activity labels.
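The idea of comparing activity labels by their context can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the control-flow part of the context (direct predecessors and successors), and the function names, the Jaccard measure, the 0.8 threshold, and the toy log are all assumptions made for illustration.

```python
from collections import defaultdict


def control_flow_context(log):
    """Map each activity label to its sets of direct predecessors and successors."""
    preds = defaultdict(set)
    succs = defaultdict(set)
    for trace in log:
        # Artificial start/end markers give the first and last events a context too.
        padded = ["<start>"] + list(trace) + ["<end>"]
        for prev, curr, nxt in zip(padded, padded[1:], padded[2:]):
            preds[curr].add(prev)
            succs[curr].add(nxt)
    return preds, succs


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_synonyms(log, threshold=0.8):
    """Return label pairs whose control-flow contexts are similar (hypothetical threshold)."""
    preds, succs = control_flow_context(log)
    labels = sorted(preds)
    pairs = []
    for i, x in enumerate(labels):
        for y in labels[i + 1:]:
            # Average the predecessor and successor similarities.
            score = (jaccard(preds[x], preds[y]) + jaccard(succs[x], succs[y])) / 2
            if score >= threshold:
                pairs.append((x, y, score))
    return pairs


# Toy log (invented): each trace is a sequence of activity labels.
toy_log = [
    ["Register", "Triage", "X-ray", "Discharge"],
    ["Register", "Triage", "Xray exam", "Discharge"],
    ["Register", "Triage", "X-ray", "Discharge"],
]

print(candidate_synonyms(toy_log))
```

On this toy log, "X-ray" and "Xray exam" share the same predecessor and successor sets, so they are flagged as a candidate pair. A fuller treatment along the lines of the abstract would also weigh resource, time, and data attributes.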
Pages: 76-94
Number of pages: 19