Anonymizing k-NN Classification on MapReduce

Cited by: 5
Authors
Bazai, Sibghat Ullah [1]
Jang-Jaccard, Julian [1]
Wang, Ruili [1]
Affiliations
[1] Massey Univ, Inst Nat & Math Sci, Auckland, New Zealand
Keywords
MapReduce; Data anonymization; K-anonymity; k-NN classification; Privacy
DOI
10.1007/978-3-319-90775-8_29
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Classification algorithms play an important role in data mining: they identify the category of a new observation and are often used to derive new knowledge. However, classification algorithms on big data analytics platforms such as MapReduce and Spark often run on plain text without an appropriate privacy protection mechanism. This leaves users' data vulnerable to unauthorized access and puts it at great privacy risk. To address this concern, we propose a novel k-NN classifier that runs on an anonymized dataset on the MapReduce platform. We describe new Map and Reduce algorithms that produce different anonymized datasets for the k-NN classifier. We also detail the experiments we performed on multiple anonymized datasets to understand the trade-off between the level of privacy protection (data privacy) and high-value insights (data utility) before and after data anonymization.
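The abstract does not give the Map and Reduce algorithms themselves, so the following is only a minimal single-machine sketch of the general idea, not the authors' method. It assumes numeric quasi-identifiers generalized by fixed-width bucketing (which on its own does not guarantee k-anonymity, hence the separate check), squared Euclidean distance, and majority voting. All function names, the bucket width, and the sample records are hypothetical.

```python
from collections import Counter
from functools import reduce
import heapq


def generalize(value, width):
    """Replace a numeric value with the midpoint of its width-sized bucket."""
    low = (value // width) * width
    return low + width / 2


def anonymize(record, width):
    """Generalize every feature of a (features, label) record."""
    features, label = record
    return tuple(generalize(v, width) for v in features), label


def is_k_anonymous(records, k_anon):
    """True if every generalized feature tuple is shared by at least k_anon records."""
    groups = Counter(features for features, _ in records)
    return all(count >= k_anon for count in groups.values())


def map_partition(partition, query, k_neighbors):
    """Map step: the k nearest (squared distance, label) pairs in one partition."""
    scored = (
        (sum((a - b) ** 2 for a, b in zip(features, query)), label)
        for features, label in partition
    )
    return heapq.nsmallest(k_neighbors, scored)


def reduce_candidates(left, right, k_neighbors):
    """Reduce step: merge two candidate lists, keeping the k nearest overall."""
    return heapq.nsmallest(k_neighbors, left + right)


def classify(partitions, query, k_neighbors):
    """Run map over each partition, reduce the candidates, then majority-vote."""
    mapped = (map_partition(p, query, k_neighbors) for p in partitions)
    nearest = reduce(lambda a, b: reduce_candidates(a, b, k_neighbors), mapped)
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical sample data: (age, hours per week) with a class label.
raw_partitions = [
    [((22, 38), "A"), ((24, 36), "A"), ((27, 33), "A")],
    [((61, 12), "B"), ((64, 18), "B"), ((68, 15), "B")],
]
anon_partitions = [[anonymize(r, 10) for r in part] for part in raw_partitions]
all_anon = [r for part in anon_partitions for r in part]

print(is_k_anonymous(all_anon, 3))
print(classify(anon_partitions, (30, 30), 3))
```

Widening the buckets makes larger groups of records indistinguishable (more privacy) but moves the generalized points further from the raw values, which is one simple way to observe the privacy and utility trade-off the abstract refers to.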
Pages: 364-377
Page count: 14