A Contextual Approach to Detecting Synonymous and Polluted Activity Labels in Process Event Logs

Cited by: 12
Authors
Sadeghianasl, Sareh [1]
ter Hofstede, Arthur H. M. [1]
Wynn, Moe T. [1]
Suriadi, Suriadi [1]
Affiliations
[1] Queensland University of Technology, Brisbane, QLD, Australia
Keywords
Data quality; Process event log; Activity label
DOI
10.1007/978-3-030-33246-4_5
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Process mining, as a well-established research area, uses algorithms for process-oriented data analysis. Similar to other types of data analysis, the existence of quality issues in input data will lead to unreliable analysis results (garbage in - garbage out). An important input for process mining is an event log which is a record of events related to a business process as it is performed through the use of an information system. While addressing quality issues in event logs is necessary, it is usually an ad-hoc and tiresome task. In this paper, we propose an automatic approach for detecting two types of data quality issues related to activities, both critical for the success of process mining studies: synonymous labels (same semantics with different syntax) and polluted labels (same semantics and same label structures). We propose the use of activity context, i.e. control flow, resource, time, and data attributes to detect semantically identical activity labels. We have implemented our approach and validated it using real-life logs from two hospitals and an insurance company, and have achieved promising results in detecting frequent imperfect activity labels.
Pages: 76-94
Number of pages: 19
Related papers
18 in total
  • [1] A Deep Learning Approach for Repairing Missing Activity Labels in Event Logs for Process Mining
    Lu, Yang
    Chen, Qifan
    Poon, Simon K.
    [J]. INFORMATION, 2022, 13 (05)
  • [2] Collaborative and Interactive Detection and Repair of Activity Labels in Process Event Logs
    Sadeghianasl, Sareh
    ter Hofstede, Arthur H. M.
    Suriadi, Suriadi
    Turkay, Selen
    [J]. 2020 2ND INTERNATIONAL CONFERENCE ON PROCESS MINING (ICPM 2020), 2020, : 41 - 48
  • [3] A Multi-View Framework to Detect Redundant Activity Labels for More Representative Event Logs in Process Mining
    Chen, Qifan
    Lu, Yang
    Tam, Charmaine S.
    Poon, Simon K.
    [J]. FUTURE INTERNET, 2022, 14 (06)
  • [4] Detecting anomalies in business process event logs using statistical leverage
    Ko, Jonghyeon
    Comuzzi, Marco
    [J]. INFORMATION SCIENCES, 2021, 549 : 53 - 67
  • [5] Process Activity Ontology Learning From Event Logs Through Gamification
    Sadeghianasl, Sareh
    Ter Hofstede, Arthur H. M.
    Wynn, Moe Thandar
    Turkay, Selen
    Myers, Trina
    [J]. IEEE ACCESS, 2021, 9 : 165865 - 165880
  • [6] A Profile Clustering Based Event Logs Repairing Approach for Process Mining
    Xu, Jiuyun
    Liu, Jie
    [J]. IEEE ACCESS, 2019, 7 : 17872 - 17881
  • [7] Event log imperfection patterns for process mining: Towards a systematic approach to cleaning event logs
    Suriadi, S.
    Andrews, R.
    ter Hofstede, A. H. M.
    Wynn, M. T.
    [J]. INFORMATION SYSTEMS, 2017, 64 : 132 - 150
  • [8] Process scenario discovery from event logs based on activity and timing information
    Zhang, Zhenyu
    Johnson, Caleb
    Venkatasubramanian, Nalini
    Ren, Shangping
    [J]. JOURNAL OF SYSTEMS ARCHITECTURE, 2022, 125
  • [9] Humans-in-the-loop: Gamifying activity label repair in process event logs
    Sadeghianasl, Sareh
    Ter Hofstede, Arthur H. M.
    Wynn, Moe Thandar
    Turkay, Selen
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 132
  • [10] An optimization-based process mining approach for explainable classification of timed event logs
    De Oliveira, Hugo
    Augusto, Vincent
    Jouaneton, Baptiste
    Lamarsalle, Ludovic
    Prodel, Martin
    Xie, Xiaolan
    [J]. 2020 IEEE 16TH INTERNATIONAL CONFERENCE ON AUTOMATION SCIENCE AND ENGINEERING (CASE), 2020, : 43 - 48