Differentially Private K-Means Clustering Applied to Meter Data Analysis and Synthesis

Cited by: 11
Authors
Ravi, Nikhil [1]
Scaglione, Anna [1]
Kadam, Sachin [2, 3]
Gentz, Reinhard [4, 5]
Peisert, Sean [4]
Lunghino, Brent [6]
Levijarvi, Emmanuel [7]
Shumavon, Aram [8]
Affiliations
[1] Cornell Tech, Dept Elect & Comp Engn, New York, NY 10044 USA
[2] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85281 USA
[3] Sungkyunkwan Univ, Suwon 16419, Gyeonggi, South Korea
[4] Lawrence Berkeley Natl Lab, Computat Res, Berkeley, CA 94720 USA
[5] Amazon, Networking Dept, Seattle, WA 98170 USA
[6] Kevala Inc, Data Sci & Methodol Implementat, San Francisco, CA 94133 USA
[7] Kevala Inc, Software Engn Dept, San Francisco, CA 94133 USA
[8] Kevala Inc, San Francisco, CA 94133 USA
Keywords
Differential privacy; clustering; smart grids; summary statistics; synthetic load generation; noise
DOI
10.1109/TSG.2022.3184252
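Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract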
The proliferation of smart meters has resulted in a large amount of data being generated. It is increasingly apparent that methods are required for allowing a variety of stakeholders to leverage the data in a manner that preserves the privacy of the consumers. The sector is scrambling to define policies, such as the so-called '15/15 rule', to respond to the need. However, the current policies fail to adequately guarantee privacy. In this paper, we address the problem of allowing third parties to apply K-means clustering, obtaining customer labels and centroids for a set of load time series by applying the framework of differential privacy. We leverage the method to design an algorithm that generates differentially private synthetic load data consistent with the labeled data. We test our algorithm's utility by answering summary statistics such as average daily load profiles for a 2-dimensional synthetic dataset and a real-world power load dataset.
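For orientation, the following is a minimal sketch of one common textbook way to make Lloyd's K-means differentially private: perturb each cluster's count and coordinate sums with Laplace noise in every iteration. It is not the algorithm of the paper, whose details are not given in this record. The function names (`dp_kmeans`, `laplace`, `clip`, `nearest`), the clipping bound of 1, the even budget split, and the add/remove-one-record neighboring relation are all assumptions made for illustration. Only centroids are released here; the privacy of per-customer labels is not addressed by this sketch.

```python
import random

def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and scale `scale`.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def clip(x, bound=1.0):
    # Assumption: every coordinate is clipped to [-bound, bound] so that
    # one record has bounded influence on the cluster sums.
    return [max(-bound, min(bound, v)) for v in x]

def nearest(x, centroids):
    # Index of the centroid closest to x in squared Euclidean distance.
    return min(range(len(centroids)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(x, centroids[j])))

def dp_kmeans(points, k, epsilon, iters=5, seed=0):
    """Illustrative Laplace-mechanism K-means (hypothetical helper)."""
    rng = random.Random(seed)
    data = [clip(p) for p in points]
    d = len(data[0])
    # Sequential composition: each iteration gets epsilon / iters,
    # split evenly between the count query and the sum query.
    eps_iter = epsilon / iters
    count_scale = 2.0 / eps_iter       # L1 sensitivity 1 for counts
    sum_scale = 2.0 * d / eps_iter     # L1 sensitivity d for sums in [-1, 1]^d
    # Data-independent initialization, so it costs no privacy budget.
    centroids = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(k)]
    for _ in range(iters):
        counts = [0] * k
        sums = [[0.0] * d for _ in range(k)]
        for x in data:
            j = nearest(x, centroids)
            counts[j] += 1
            for t in range(d):
                sums[j][t] += x[t]
        for j in range(k):
            noisy_n = counts[j] + laplace(count_scale, rng)
            if noisy_n < 1.0:
                continue  # too few (noisy) members: keep the old centroid
            noisy_sum = [s + laplace(sum_scale, rng) for s in sums[j]]
            centroids[j] = clip([s / noisy_n for s in noisy_sum])
    return centroids
```

A call such as `dp_kmeans(load_profiles, k=3, epsilon=1.0)` would return three noisy centroids, where `load_profiles` is a hypothetical list of normalized load vectors. Smaller `epsilon` or more iterations mean more noise per update, which is the privacy and utility trade-off the abstract alludes to. The floating-point noise sampling is for illustration only and has not been audited for production use.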
Pages: 4801-4814
Page count: 14