Differentially Private K-Means Clustering Applied to Meter Data Analysis and Synthesis

Cited by: 11

Authors:

Ravi, Nikhil ^{[1]}

Scaglione, Anna ^{[1]}

Kadam, Sachin ^{[2,3]}

Gentz, Reinhard ^{[4,5]}

Peisert, Sean ^{[4]}

Lunghino, Brent ^{[6]}

Levijarvi, Emmanuel ^{[7]}

Shumavon, Aram ^{[8]}

Affiliations:

[1] Cornell Tech, Dept Elect & Comp Engn, New York, NY 10044 USA

[2] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85281 USA

[3] Sungkyunkwan Univ, Suwon 16419, Gyeonggi, South Korea

[4] Lawrence Berkeley Natl Lab, Computat Res, Berkeley, CA 94720 USA

[5] Amazon, Networking Dept, Seattle, WA 98170 USA

[6] Kevala Inc, Data Sci & Methodol Implementat, San Francisco, CA 94133 USA

[7] Kevala Inc, Software Engn Dept, San Francisco, CA 94133 USA

[8] Kevala Inc, San Francisco, CA 94133 USA

Source:

IEEE TRANSACTIONS ON SMART GRID | 2022 / Vol. 13 / No. 06

Keywords:

Differential privacy; clustering; smart grids; summary statistics; synthetic load generation; NOISE;

DOI:

10.1109/TSG.2022.3184252

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

The proliferation of smart meters has resulted in a large amount of data being generated. It is increasingly apparent that methods are required for allowing a variety of stakeholders to leverage the data in a manner that preserves the privacy of the consumers. The sector is scrambling to define policies, such as the so-called '15/15 rule', to respond to this need. However, the current policies fail to adequately guarantee privacy. In this paper, we address the problem of allowing third parties to apply K-means clustering, obtaining customer labels and centroids for a set of load time series, by applying the framework of differential privacy. We leverage the method to design an algorithm that generates differentially private synthetic load data consistent with the labeled data. We test our algorithm's utility by answering summary statistics such as average daily load profiles for a 2-dimensional synthetic dataset and a real-world power load dataset.

Pages: 4801-4814

Page count: 14

