A Graph-Based Differentially Private Algorithm for Mining Frequent Sequential Patterns

Cited by: 1
Authors
Nunez-del-Prado, Miguel [1, 2]
Maehara-Aliaga, Yoshitomi [1]
Salas, Julian [3, 4]
Alatrista-Salas, Hugo [5]
Megias, David [4, 6]
Affiliations
[1] Peru Res Dev & Innovat PERU IDI, Lima 15047, Peru
[2] Univ Andina Cusco, Inst Invest, Cuzco 08006, Peru
[3] Univ Rovira & Virgili URV, Dept Engn Informat & Matemat, Tarragona 43007, Spain
[4] Ctr Cybersecur Res Catalonia CYBERCAT, Barcelona 08860, Spain
[5] Pontificia Univ Catolica Peru PUCP, Sch Sci & Engn, Lima 5088, Peru
[6] Univ Oberta Catalunya UOC, Internet Interdisciplinary Inst IN3, Barcelona 08860, Spain
Source
APPLIED SCIENCES-BASEL | 2022, Vol. 12, Issue 04
Keywords
sequential pattern mining; differential privacy; frequent pattern mining; edge differential privacy; graph differential privacy; anonymization of big data;
DOI
10.3390/app12042131
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
Currently, individuals leave a digital trace of their activities when they use their smartphones, social media, mobile apps, credit card payments, Internet surfing profile, etc. These digital activities hide intrinsic usage patterns, which can be extracted using sequential pattern algorithms. Sequential pattern mining is a promising approach for discovering temporal regularities in huge and heterogeneous databases. These sequences represent individuals' common behavior and could contain sensitive information. Thus, sequential patterns should be sanitized to preserve individuals' privacy. Hence, many algorithms have been proposed to accomplish this task. However, these techniques add noise to the candidate support before the candidates are validated as frequent, and thus, they cannot be applied without having access to all the users' sequence data. In this paper, we propose a differential privacy graph-based technique for publishing frequent sequential patterns. It is applied at the post-processing stage; hence, it may be used to protect frequent sequential patterns after they have been extracted, without the need to access all the users' sequences. To validate our proposal, we performed a detailed assessment of its utility as a pattern mining algorithm and calculated the impact of the sanitization mechanism on a recommender system. We further evaluated its information loss and disclosure risk and performed a comparison with the DP-FSM algorithm.
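Illustrative note: the abstract contrasts the proposal with earlier techniques that perturb candidate supports before the frequency check. The sketch below shows only that generic idea, using the standard Laplace mechanism; it is not the paper's graph-based method, and it is not DP-FSM. The function names, the `sensitivity=1.0` default, and the sample supports are assumptions made for illustration, and the sketch ignores how the privacy budget would be split across many patterns.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent Exp(1) draws follows Laplace(0, 1);
    # multiplying by `scale` gives Laplace(0, scale).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noisy_frequent_patterns(supports, min_support, epsilon,
                            sensitivity=1.0, seed=None):
    """Add Laplace noise to each candidate support, then apply the threshold.

    supports    : dict mapping a pattern (tuple of items) to its true support
    min_support : frequency threshold applied to the noisy supports
    epsilon     : privacy parameter; smaller values mean more noise
    sensitivity : assumed change in one support caused by one user (hypothetical)
    """
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    noisy = {p: s + laplace_noise(scale, rng) for p, s in supports.items()}
    # Only patterns whose noisy support reaches the threshold are released.
    return {p: s for p, s in noisy.items() if s >= min_support}


if __name__ == "__main__":
    # Hypothetical supports for three candidate sequential patterns.
    candidates = {("a", "b"): 40, ("b", "c"): 12, ("a", "c"): 5}
    print(noisy_frequent_patterns(candidates, min_support=10,
                                  epsilon=0.5, seed=7))
```

Note that this style of sanitization needs the true supports of all candidates, which is the dependency on the full sequence data that the abstract describes for prior work.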
Pages: 13
Related papers
50 in total
  • [41] Differentially Private Frequent Itemset Mining via Transaction Splitting
    Su, Sen
    Xu, Shengzhi
    Cheng, Xiang
    Li, Zhengyi
    Yang, Fangchun
    [J]. 2016 32ND IEEE INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2016, : 1564 - 1565
  • [42] GASP: Graph-Based Approximate Sequential Pattern Mining for Electronic Health Records
    Dong, Wenqin
    Lee, Eric W.
    Hertzberg, Vicki Stover
    Simpson, Roy L.
    Ho, Joyce C.
    [J]. NEW TRENDS IN DATABASE AND INFORMATION SYSTEMS, ADBIS 2021, 2021, 1450 : 50 - 60
  • [43] Differentially Private Frequent Itemset Mining Against Incremental Updates
    Liang, Wenjuan
    Chen, Hong
    Wu, Yuncheng
    Li, Cuiping
    [J]. INFORMATION AND COMMUNICATIONS SECURITY (ICICS 2019), 2020, 11999 : 649 - 667
  • [44] A fast algorithm for mining frequent patterns
    Ruan, YL
    Zhang, JJ
    Li, QH
    Yang, SD
    [J]. PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2004, : 1683 - 1686
  • [45] Differentially Private Frequent Itemset Mining via Transaction Splitting
    Su, Sen
    Xu, Shengzhi
    Cheng, Xiang
    Li, Zhengyi
    Yang, Fangchun
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2015, 27 (07) : 1875 - 1891
  • [46] WTPMiner: Efficient mining of weighted frequent patterns based on graph traversals
    Geng, Runian
    Xu, Wenbo
    Dong, Xiangjun
    [J]. KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, 2007, 4798 : 412 - +
  • [47] Differentially Private Frequent Sequence Mining via Sampling-based Candidate Pruning
    Xu, Shengzhi
    Su, Sen
    Cheng, Xiang
    Li, Zhengyi
    Xiong, Li
    [J]. 2015 IEEE 31ST INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2015, : 1035 - 1046
  • [48] An efficient graph-based multi-relational data mining algorithm
    Guo, Jingfeng
    Zheng, Lizhen
    Li, Tieying
    [J]. CIS: 2007 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND SECURITY, PROCEEDINGS, 2007, : 176 - +
  • [49] An Algorithm of Frequent Patterns Mining Based on Binary Information Granule
    Fang, G.
    Wu, Y.
    [J]. PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPUTER INFORMATION SYSTEMS AND INDUSTRIAL APPLICATIONS (CISIA 2015), 2015, 18 : 47 - 50
  • [50] Research on a Graph-Based Algorithm
    Dai, Shang-ping
    Duan Xin
    [J]. PROCEEDINGS OF THE 2008 INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN, VOL 1, 2008, : 17 - 20