Bagging Recurrent Event Imputation for Repair of Imperfect Event Log With Missing Categorical Events

Cited by: 4
Authors
Sim, Sunghyun [1]
Bae, Hyerim [2]
Liu, Ling [3]
Affiliations
[1] Pusan Natl Univ, Inst Intelligent Logist Big Data, Busan, South Korea
[2] Pusan Natl Univ, Dept Ind Engn, 30 Jan Jeon Dong, Busan 609753, South Korea
[3] Georgia Inst Technol, Coll Comp, Atlanta, GA 30332 USA
Funding
National Research Foundation, Singapore;
Keywords
Process mining; event log quality; missing event imputation; event chain; IMPACT; VALUES; MODELS; MICE;
DOI
10.1109/TSC.2021.3118381
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In most computing services, imperfect event logs with missing events are generated for a variety of reasons. Because missing events in imperfect event logs adversely affect the results of process mining analysis, it is essential to handle them effectively. Most existing process mining studies focus on methodologies for generating good process models; very few methodologies, in fact, have been developed to deal with missing events. To the best of our knowledge, there is a lack of high-performance methods for restoring missing events in actual event log data. In this paper, we propose a new categorical event imputation method that can restore missing categorical events by learning the structural features between observed events in the event log. We evaluated the proposed method through comparative experiments with previous studies using six real datasets. The results demonstrate that restoration performance was greatly improved and that our proposed method can therefore significantly improve both the quality of event logs (specifically by restoring missing events in imperfect event logs) and the overall quality of process mining analysis.
Pages: 108-121
Number of pages: 14