Efficient approximation and privacy preservation algorithms for real time online evolving data streams

Cited by: 4
Authors
Patil, Rahul A. [1, 2]
Patil, Pramod D. [1]
Affiliations
[1] Dr D Y Patil Inst Technol, Pimpri Pune 411018, Maharashtra, India
[2] Pimpri Chinchwad Coll Engn, Pune 411044, Maharashtra, India
Keywords
Approximation; Data streaming; Clustering; k-anonymization; l-diversity; Privacy preservation; ANONYMIZATION;
DOI
10.1007/s11280-024-01244-9
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Mining real-time streaming data is a more challenging research problem than mining static data because it requires processing large, continuous, unstructured streams. Privacy concerns arise whenever the stream contains sensitive data. In recent years, there has been significant progress in research on the anonymization of static data. For the anonymization of quasi-identifiers, the two typical strategies are generalization and suppression. However, the highly dynamic and potentially unbounded nature of streaming data makes anonymizing it a challenging task. To this end, we propose a novel Efficient Approximation and Privacy Preservation Algorithms (EAPPA) framework that pre-processes live streaming data efficiently and preserves its privacy with minimal Information Loss (IL) and computational requirements. Because existing privacy preservation solutions for streaming data suffer from redundant data, we first propose an efficient technique for data approximation with data pre-processing. We design the Flajolet-Martin (FM) algorithm for robust and efficient approximation of the unique elements in the data stream, together with a data cleaning mechanism. We feed the periodically approximated and pre-processed streaming data to the anonymization algorithm. Using adaptive clustering, we propose innovative k-anonymization and l-diversity privacy principles for data streams. The proposed approach scans a stream to detect and reuse clusters that fulfill the k-anonymity and l-diversity criteria, reducing anonymization time and IL. The experimental results show the efficiency of the EAPPA framework compared with state-of-the-art methods.
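To make the two building blocks named in the abstract concrete, the following is a minimal Python sketch, not the EAPPA implementation. It shows a textbook Flajolet-Martin estimate of the number of distinct elements in a stream and a simple check of whether a cluster meets the k-anonymity and l-diversity criteria. The function names, the BLAKE2b-based hash, the default of 64 hash functions, and the record layout (dictionaries with a "disease" field) are all assumptions made for illustration; the abstract specifies none of them.

```python
import hashlib

PHI = 0.77351  # correction factor from Flajolet and Martin's analysis


def hash64(item, seed):
    """Deterministic 64-bit hash of item under a seed (illustrative choice)."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def trailing_zeros(x, width=64):
    """Number of trailing zero bits in x; returns width when x is 0."""
    return width if x == 0 else (x & -x).bit_length() - 1


def fm_distinct_estimate(stream, num_hashes=64):
    """Estimate the number of distinct items with one FM bitmap per hash."""
    bitmaps = [0] * num_hashes
    for item in stream:
        for seed in range(num_hashes):
            # Record the trailing-zero count of this hash value in the bitmap.
            bitmaps[seed] |= 1 << trailing_zeros(hash64(item, seed))
    total_r = 0
    for bitmap in bitmaps:
        r = 0
        while (bitmap >> r) & 1:  # r = index of the lowest unset bit
            r += 1
        total_r += r
    # Average R over all hash functions, then apply the FM correction.
    return 2 ** (total_r / num_hashes) / PHI


def satisfies_k_l(cluster, k, l, sensitive_key):
    """True if the cluster has at least k records and l distinct sensitive values."""
    distinct_sensitive = {record[sensitive_key] for record in cluster}
    return len(cluster) >= k and len(distinct_sensitive) >= l


if __name__ == "__main__":
    # Hypothetical stream: 1000 distinct user IDs, each seen three times.
    stream = [f"user-{i}" for i in range(1000)] * 3
    print("estimated distinct items:", round(fm_distinct_estimate(stream)))

    # Hypothetical cluster of already-generalized records.
    cluster = [
        {"age": "30-40", "disease": "flu"},
        {"age": "30-40", "disease": "asthma"},
        {"age": "30-40", "disease": "flu"},
    ]
    print("publishable:", satisfies_k_l(cluster, k=3, l=2, sensitive_key="disease"))
```

Duplicates leave the bitmaps unchanged, so the estimate depends only on the set of distinct items. That property is what makes the sketch suitable for streams with redundant data. A streaming anonymizer could run a check like `satisfies_k_l` on each candidate cluster before releasing or reusing it.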
Pages: 20
Related papers
50 in total
  • [1] Efficient approximation and privacy preservation algorithms for real time online evolving data streams
    Rahul A. Patil
    Pramod D. Patil
    World Wide Web, 2024, 27
  • [2] Clustering over Evolving Data Streams Based on Online Recent-Biased Approximation
    Fan, Wei
    Koyanagi, Yusuke
    Asakura, Koichi
    Watanabe, Toyohide
    KNOWLEDGE ACQUISITION: APPROACHES, ALGORITHMS AND APPLICATIONS, 2009, 5465 : 12 - +
  • [3] Efficient Data Streams Processing in the Real Time Data Warehouse
    Majeed, Fiaz
    Mahmood, Muhammad Sohaib
    Iqbal, Mujahid
    PROCEEDINGS OF 2010 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND INFORMATION TECHNOLOGY (ICCSIT 2010), VOL 5, 2010, : 57 - 61
  • [4] Space-efficient Online Approximation of Time Series Data: Streams, Amnesia, and Out-of-order
    Gandhi, Sorabh
    Foschini, Luca
    Suri, Subhash
    26TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING ICDE 2010, 2010, : 924 - 935
  • [5] Hiding in the crowd: Privacy preservation on evolving streams through correlation tracking
    Li, Feifei
    Sun, Jimeng
    Papadimitriou, Spiros
    Mihaila, George A.
    Stanoi, Ioana
    2007 IEEE 23RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3, 2007, : 661 - +
  • [6] An efficient online histogram publication method for data streams with local differential privacy
    Tao, Tao
    Zhang, Funan
    Wang, Xiujun
    Zheng, Xiao
    Zhao, Xin
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2024, 25 (08) : 1096 - 1109
  • [7] Evolving fuzzy systems from data streams in real-time
    Angelov, Plamen
    Zhou, Xiaowei
    2006 INTERNATIONAL SYMPOSIUM ON EVOLVING FUZZY SYSTEMS, PROCEEDINGS, 2006, : 29 - +
  • [8] Online embedding and clustering of evolving data streams
    Zubaroglu, Alaettin
    Atalay, Volkan
    STATISTICAL ANALYSIS AND DATA MINING, 2023, 16 (01) : 29 - 44
  • [9] Online Clustering for Evolving Data Streams with Online Anomaly Detection
    Chenaghlou, Milad
    Moshtaghi, Masud
    Leckie, Christopher
    Salehi, Mahsa
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2018, PT II, 2018, 10938 : 506 - 519
  • [10] Dynamically Evolving Fuzzy Classifier for Real-time Classification of Data Streams
    Baruah, Rashmi Dutta
    Angelov, Plamen
    Baruah, Diganta
    2014 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2014, : 383 - 389