Differentially private frequent episode mining over event streams

Cited by: 3
Authors
Qin, Jiawen [1, 2]
Wang, Jinyan [1, 2]
Li, Qiyu [2]
Fang, Shijian [2]
Li, Xianxian [1, 2]
Lei, Lei [3]
Affiliations
[1] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin, Peoples R China
[2] Guangxi Normal Univ, Sch Comp Sci & Engn, Guilin, Peoples R China
[3] Guangxi Nanning Tianchengzhiyuan Intellectual Pro, Nanning, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Differential privacy; Frequent episode; Event streams; Privacy preservation; Real-time data mining;
DOI
10.1016/j.engappai.2022.104681
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Frequent episode mining is a widely applicable framework for mining sequential data. An episode is a short, totally ordered collection of event types, and mining frequent episodes uncovers temporal correlations in event streams without information loss. Despite these benefits, directly releasing frequent episodes to the public can seriously threaten individual privacy, yet little work so far has addressed private frequent episode mining. In this paper, we investigate the privacy problem that arises when frequent episodes are mined from event streams and released continuously over successive windows, and we propose a real-time differentially private frequent episode mining algorithm over event streams that prevents privacy leakage with an omega-event privacy guarantee. To obtain private frequent episodes, we propose a sample-based perturbation approach, which improves the accuracy of selecting frequent episodes based on sampled databases. To reduce private mining time and to avoid, as far as possible, repeatedly allocating privacy budget to the overlapping portion of the windows of adjacent releases, we present an incremental perturbation approach driven by the outcome of a dissimilarity calculation mechanism. Meanwhile, to protect data collected at any omega successive timestamps of the event stream, we employ an adaptive omega-event privacy mechanism based on the dynamicity of episodes. Finally, experimental results on real-world datasets demonstrate the effectiveness and efficiency of our algorithm.
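The omega-event privacy guarantee mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it uses a plain uniform budget split (epsilon / w per timestamp) and Laplace noise on a single episode's support count, whereas the abstract describes sample-based, incremental, and adaptive mechanisms whose details are not given here. The names `count_serial_episode`, `WindowBudget`, and `release_noisy_counts`, the non-overlapping counting rule, and the `sensitivity` default of 1.0 are all assumptions made for illustration.

```python
import random
from collections import deque


def count_serial_episode(events, episode):
    """Count non-overlapping occurrences of a serial episode in an event list."""
    count, pos = 0, 0
    for event in events:
        if event == episode[pos]:
            pos += 1
            if pos == len(episode):  # one full occurrence matched
                count += 1
                pos = 0
    return count


class WindowBudget:
    """Tracks spending so any w consecutive timestamps use at most epsilon."""

    def __init__(self, epsilon, w):
        self.epsilon = epsilon
        self.w = w
        # Budgets spent at the previous w - 1 timestamps (oldest drops out).
        self.spent = deque(maxlen=w - 1)

    def remaining(self):
        return self.epsilon - sum(self.spent)

    def spend(self, amount):
        if amount > self.remaining() + 1e-12:
            raise ValueError("budget for the current w-timestamp window exceeded")
        self.spent.append(amount)


def release_noisy_counts(windows, episode, epsilon, w, sensitivity=1.0, seed=0):
    """Release one Laplace-perturbed support count per timestamp.

    Assumption: uniform allocation of epsilon / w per timestamp, and a
    caller-supplied sensitivity for the count query.
    """
    rng = random.Random(seed)
    budget = WindowBudget(epsilon, w)
    released = []
    for events in windows:
        eps_t = epsilon / w
        budget.spend(eps_t)
        scale = sensitivity / eps_t
        # The difference of two i.i.d. exponentials is Laplace(0, scale).
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        released.append(count_serial_episode(events, episode) + noise)
    return released


if __name__ == "__main__":
    stream = [list("ABAB"), list("BA"), list("AABB")]
    print(release_noisy_counts(stream, ("A", "B"), epsilon=1.0, w=2))
```

A uniform split is the simplest way to satisfy the constraint that the budgets of any w consecutive releases sum to at most epsilon; an adaptive scheme would instead vary the per-timestamp amount while still passing the same `WindowBudget.spend` check.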
Pages: 16