Privacy-preserving Data Classification and Similarity Evaluation for Distributed Systems

Cited by: 14
Authors
Jia, Qi [1]
Guo, Linke [1]
Jin, Zhanpeng [1]
Fang, Yuguang [2]
Affiliations
[1] Binghamton Univ, Dept Elect & Comp Engn, Binghamton, NY 13902 USA
[2] Univ Florida, Dept Elect & Comp Engn, Gainesville, FL 32611 USA
Keywords
Privacy Preservation; Data Classification; Similarity Evaluation; Machine Learning
DOI
10.1109/ICDCS.2016.94
Chinese Library Classification
TP3 [Computing Technology, Computer Technology]
Discipline classification code
0812
Abstract
Data classification is a widely used data mining technique for big data analysis. By training on massive data collected from the real world, data classification helps learners discover hidden data patterns. In addition to data training, given a trained model from collected data, a user can classify whether a new incoming data item belongs to an existing class; or, multiple distributed entities may collaborate to test the similarity of their trained results. However, due to data locality and privacy concerns, it is infeasible for large-scale distributed systems to share each individual's datasets with each other for data similarity checks. On the one hand, the trained model is an entity's private asset and may leak private information, which should be well protected from all other non-collaborative entities. On the other hand, the new incoming data may contain sensitive information which cannot be disclosed directly for classification. To address the above privacy issues, we propose a privacy-preserving data classification and similarity evaluation scheme for distributed systems. With our scheme, neither newly arriving data nor trained models are directly revealed during the classification and similarity evaluation procedures. The proposed scheme can be applied to many fields using data classification and evaluation. Based on extensive real-world experiments, we have also evaluated the privacy preservation, feasibility, and efficiency of the proposed scheme.
Pages: 690-699
Page count: 10
Related papers
50 in total
  • [41] Data privacy-preserving distributed knowledge discovery based on the blockchain
    Lee, Keon Myung
    Ra, Ilkyeun
    [J]. INFORMATION TECHNOLOGY & MANAGEMENT, 2020, 21 (04): 191-204
  • [42] Data privacy-preserving distributed knowledge discovery based on the blockchain
    Keon Myung Lee
    Ilkyeun Ra
    [J]. Information Technology and Management, 2020, 21: 191-204
  • [43] Privacy-Preserving Distributed Data Fusion Based on Attribute Protection
    Su, Xin
    Fan, Kuan
    Shi, Wenbo
    [J]. IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2019, 15 (10): 5765-5777
  • [44] Privacy-preserving top-N recommendation on distributed data
    Polat, Huseyin
    Du, Wenliang
    [J]. JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2008, 59 (07): 1093-1108
  • [45] Privacy-preserving hybrid collaborative filtering on cross distributed data
    Ibrahim Yakut
    Huseyin Polat
    [J]. Knowledge and Information Systems, 2012, 30: 405-433
  • [46] Distributed Privacy-Preserving Aggregation of Metering Data in Smart Grids
    Rottondi, Cristina
    Verticale, Giacomo
    Krauss, Christoph
    [J]. IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2013, 31 (07): 1342-1354
  • [47] A Privacy-Preserving Distributed Analytics Platform for Health Care Data
    Welten, Sascha
    Mou, Yongli
    Neumann, Laurenz
    Jaberansary, Mehrshad
    Ucer, Yeliz Yediel
    Kirsten, Toralf
    Decker, Stefan
    Beyan, Oya
    [J]. METHODS OF INFORMATION IN MEDICINE, 2022, 61: E1-E11
  • [48] Privacy-preserving Bayesian network learning on distributed heterogeneous data
    Wang, Hong-Mei
    Zeng, Yuan
    Zhao, Zheng
    Wang, Cheng-Shan
    [J]. Tianjin Daxue Xuebao (Ziran Kexue yu Gongcheng Jishu Ban)/Journal of Tianjin University Science and Technology, 2007, 40 (09): 1025-1028
  • [49] Privacy-Preserving and Secure Distributed Data Sharing Scheme for VANETs
    Anhui University, Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Computer Science and Technology, Hefei
    230039, China
    Unknown
    230039, China
    Unknown
    460018, Russia
    Unknown
    430072, China
    Unknown
    201204, China
    [J]. IEEE Trans. Mob. Comput., 2024, 12 (13882-13897): 13882-13897
  • [50] Privacy-preserving hybrid collaborative filtering on cross distributed data
    Yakut, Ibrahim
    Polat, Huseyin
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2012, 30 (02): 405-433