Communication-Efficient k-Means for Edge-Based Machine Learning

Cited by: 1
Authors
Lu, Hanlin [1 ]
He, Ting [1 ]
Wang, Shiqiang [2 ]
Liu, Changchang [2 ]
Mahdavi, Mehrdad [1 ]
Narayanan, Vijaykrishnan [1 ]
Chan, Kevin S. [3 ]
Pasteris, Stephen [4 ]
Affiliations
[1] Penn State Univ, University Pk, PA 16802 USA
[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
[3] Army Res Lab, Adelphi, MD 20783 USA
[4] UCL, London WC1E 6EA, England
Funding
US National Science Foundation
Keywords
k-Means; dimensionality reduction; coreset; random projection; quantization; edge-based machine learning; JOHNSON-LINDENSTRAUSS; CORESETS;
DOI
10.1109/TPDS.2022.3144595
CLC Classification Number
TP301 [Theory and Methods]
Discipline Code
081202
Abstract
We consider the problem of computing the k-means centers for a large high-dimensional dataset in the context of edge-based machine learning, where data sources offload machine learning computation to nearby edge servers. k-Means computation is fundamental to many data analytics tasks, and the capability of computing provably accurate k-means centers by leveraging the computation power of the edge servers, at a low communication and computation cost to the data sources, will greatly improve the performance of these tasks. We propose to let the data sources send small summaries, generated by joint dimensionality reduction (DR), cardinality reduction (CR), and quantization (QT), to support approximate k-means computation at reduced complexity and communication cost. By analyzing the complexity, the communication cost, and the approximation error of k-means algorithms based on carefully designed compositions of DR/CR/QT methods, we show that: (i) it is possible to compute near-optimal k-means centers at near-linear complexity and a constant or logarithmic communication cost, (ii) the order in which DR and CR are applied significantly affects the complexity and the communication cost, and (iii) combining DR/CR methods with a properly configured quantizer can further reduce the communication cost without compromising the other performance metrics. Our theoretical analysis is validated through experiments on real datasets.
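The DR → CR → QT pipeline described in the abstract can be sketched in a few lines of numpy. This is an illustrative toy, not the paper's exact construction: DR is a Johnson-Lindenstrauss-style Gaussian random projection, CR is simplified to uniform subsampling with reweighting (a crude stand-in for the paper's coreset methods), QT is uniform scalar quantization, and the edge server runs a plain weighted Lloyd's algorithm on the received summary. All function names and parameter values here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def dimensionality_reduce(X, d_target):
    """DR: Gaussian random projection (Johnson-Lindenstrauss style)."""
    d = X.shape[1]
    P = rng.normal(size=(d, d_target)) / np.sqrt(d_target)
    return X @ P

def cardinality_reduce(X, m):
    """CR: uniform subsample with reweighting (crude coreset stand-in)."""
    idx = rng.choice(len(X), size=m, replace=False)
    weights = np.full(m, len(X) / m)  # each kept point represents n/m points
    return X[idx], weights

def quantize(X, bits=8):
    """QT: uniform scalar quantization of every coordinate."""
    lo, hi = X.min(), X.max()
    levels = 2 ** bits - 1
    q = np.round((X - lo) / (hi - lo) * levels)
    return q / levels * (hi - lo) + lo

def weighted_kmeans(X, w, k, iters=50):
    """Lloyd's algorithm on a weighted summary."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(axis=1)
        for j in range(k):
            mask = assign == j
            if mask.any():
                centers[j] = np.average(X[mask], axis=0, weights=w[mask])
    return centers

# Data source side: build a small summary (DR -> CR -> QT) and "send" it.
n, d, k = 5000, 100, 5
X = rng.normal(size=(n, d)) + rng.integers(0, k, n)[:, None] * 3.0
Xr = dimensionality_reduce(X, d_target=20)   # 100 dims -> 20 dims
S, w = cardinality_reduce(Xr, m=500)         # 5000 points -> 500 points
Sq = quantize(S, bits=8)                     # 8 bits per coordinate

# Edge server side: run k-means on the tiny quantized summary.
centers = weighted_kmeans(Sq, w, k)
print(centers.shape)  # → (5, 20)
```

Note how the order of DR and CR matters even in this toy: projecting before subsampling means the quantizer and the subsequent k-means only ever touch the reduced dimension, which is the kind of complexity/communication trade-off the paper analyzes.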
Pages: 2509-2523
Page count: 15
Related Papers
50 records in total
  • [21] Zhu, Yonghui; Zhang, Ronghui; Cui, Yuanhao; Wu, Sheng; Jiang, Chunxiao; Jing, Xiaojun. Communication-Efficient Personalized Federated Edge Learning for Decentralized Sensing in ISAC. 2023 IEEE International Conference on Communications Workshops (ICC Workshops), 2023: 207-212.
  • [22] Li, Liang; Wang, Jia; Li, Xuetao. Efficiency Analysis of Machine Learning Intelligent Investment Based on K-Means Algorithm. IEEE Access, 2020, 8: 147463-147470.
  • [23] Song, Shijin; Du, Sen; Song, Yuefeng; Zhu, Yongxin. Communication-Efficient and Private Federated Learning with Adaptive Sparsity-Based Pruning on Edge Computing. Electronics, 2024, 13(17).
  • [24] Hamerly, G.; Elkan, C. Learning the k in k-means. Advances in Neural Information Processing Systems 16, 2004, 16: 281-288.
  • [25] Li, Chengxi; Li, Gang; Varshney, Pramod K. Communication-Efficient Federated Learning Based on Compressed Sensing. IEEE Internet of Things Journal, 2021, 8(20): 15531-15541.
  • [26] Shi, Yuanming; Yang, Kai; Jiang, Tao; Zhang, Jun; Letaief, Khaled B. Communication-Efficient Edge AI: Algorithms and Systems. IEEE Communications Surveys and Tutorials, 2020, 22(4): 2167-2191.
  • [27] Zhou, Yanlin; Ma, Xiyao; Wu, Dapeng; Li, Xiaolin. Communication-Efficient and Attack-Resistant Federated Edge Learning With Dataset Distillation. IEEE Transactions on Cloud Computing, 2023, 11(3): 2517-2528.
  • [28] Cleland, Gary; Wu, Di; Ullah, Rehmat; Varghese, Blesson. FedComm: Understanding Communication Protocols for Edge-based Federated Learning. 2022 IEEE/ACM 15th International Conference on Utility and Cloud Computing (UCC), 2022: 71-81.
  • [29] Feng, Ziqiang; George, Shilpa; Harkes, Jan; Klatzky, Roberta L.; Satyanarayanan, Mahadev; Pillai, Padmanabhan. Eureka: Edge-Based Discovery of Training Data for Machine Learning. IEEE Internet Computing, 2019, 23(4): 35-42.
  • [30] Ruan, Mengzhe; Yan, Guangfeng; Xiao, Yuanzhang; Song, Linqi; Xu, Weitao. Adaptive Top-K in SGD for Communication-Efficient Distributed Learning. IEEE Conference on Global Communications (GLOBECOM), 2023: 5280-5285.