Communication-Efficient k-Means for Edge-Based Machine Learning

Cited by: 1
Authors
Lu, Hanlin [1 ]
He, Ting [1 ]
Wang, Shiqiang [2 ]
Liu, Changchang [2 ]
Mahdavi, Mehrdad [1 ]
Narayanan, Vijaykrishnan [1 ]
Chan, Kevin S. [3 ]
Pasteris, Stephen [4 ]
Affiliations
[1] Penn State Univ, University Pk, PA 16802 USA
[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
[3] Army Res Lab, Adelphi, MD 20783 USA
[4] UCL, London WC1E 6EA, England
Funding
US National Science Foundation
Keywords
k-Means; dimensionality reduction; coreset; random projection; quantization; edge-based machine learning; JOHNSON-LINDENSTRAUSS; CORESETS;
DOI
10.1109/TPDS.2022.3144595
CLC Classification Number
TP301 [Theory and Methods]
Discipline Code
081202
Abstract
We consider the problem of computing the k-means centers for a large high-dimensional dataset in the context of edge-based machine learning, where data sources offload machine learning computation to nearby edge servers. k-Means computation is fundamental to many data analytics tasks, and the capability of computing provably accurate k-means centers by leveraging the computation power of the edge servers, at a low communication and computation cost to the data sources, will greatly improve the performance of these tasks. We propose to let the data sources send small summaries, generated by joint dimensionality reduction (DR), cardinality reduction (CR), and quantization (QT), to support approximate k-means computation at reduced complexity and communication cost. By analyzing the complexity, the communication cost, and the approximation error of k-means algorithms based on carefully designed compositions of DR/CR/QT methods, we show that: (i) it is possible to compute near-optimal k-means centers at near-linear complexity and a constant or logarithmic communication cost, (ii) the order in which DR and CR are applied significantly affects the complexity and the communication cost, and (iii) combining DR/CR methods with a properly configured quantizer can further reduce the communication cost without compromising the other performance metrics. Our theoretical analysis is validated through experiments on real datasets.
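The DR → CR → QT pipeline described in the abstract can be sketched in a few lines of numpy. This is an illustrative toy, not the paper's exact construction: DR is a Johnson-Lindenstrauss-style Gaussian random projection, CR is simplified to uniform subsampling with reweighting (a crude stand-in for the paper's coreset methods), QT is uniform scalar quantization, and the edge server runs a plain weighted Lloyd's algorithm on the received summary. All function names and parameter values here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def dimensionality_reduce(X, d_target):
    """DR: Gaussian random projection (Johnson-Lindenstrauss style)."""
    d = X.shape[1]
    P = rng.normal(size=(d, d_target)) / np.sqrt(d_target)
    return X @ P

def cardinality_reduce(X, m):
    """CR: uniform subsample with reweighting (crude coreset stand-in)."""
    idx = rng.choice(len(X), size=m, replace=False)
    weights = np.full(m, len(X) / m)  # each kept point represents n/m points
    return X[idx], weights

def quantize(X, bits=8):
    """QT: uniform scalar quantization of every coordinate."""
    lo, hi = X.min(), X.max()
    levels = 2 ** bits - 1
    q = np.round((X - lo) / (hi - lo) * levels)
    return q / levels * (hi - lo) + lo

def weighted_kmeans(X, w, k, iters=50):
    """Lloyd's algorithm on a weighted summary."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(axis=1)
        for j in range(k):
            mask = assign == j
            if mask.any():
                centers[j] = np.average(X[mask], axis=0, weights=w[mask])
    return centers

# Data source side: build a small summary (DR -> CR -> QT) and "send" it.
n, d, k = 5000, 100, 5
X = rng.normal(size=(n, d)) + rng.integers(0, k, n)[:, None] * 3.0
Xr = dimensionality_reduce(X, d_target=20)   # 100 dims -> 20 dims
S, w = cardinality_reduce(Xr, m=500)         # 5000 points -> 500 points
Sq = quantize(S, bits=8)                     # 8 bits per coordinate

# Edge server side: run k-means on the tiny quantized summary.
centers = weighted_kmeans(Sq, w, k)
print(centers.shape)  # → (5, 20)
```

Note how the order of DR and CR matters even in this toy: projecting before subsampling means the quantizer and the subsequent k-means only ever touch the reduced dimension, which is the kind of complexity/communication trade-off the paper analyzes.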
Pages: 2509-2523
Page count: 15
Related Papers
50 records in total
  • [21] Zhu, Yonghui; Zhang, Ronghui; Cui, Yuanhao; Wu, Sheng; Jiang, Chunxiao; Jing, Xiaojun. Communication-Efficient Personalized Federated Edge Learning for Decentralized Sensing in ISAC. 2023 IEEE International Conference on Communications Workshops (ICC Workshops), 2023: 207-212.
  • [22] Li, Liang; Wang, Jia; Li, Xuetao. Efficiency Analysis of Machine Learning Intelligent Investment Based on K-Means Algorithm. IEEE Access, 2020, 8: 147463-147470.
  • [23] Song, Shijin; Du, Sen; Song, Yuefeng; Zhu, Yongxin. Communication-Efficient and Private Federated Learning with Adaptive Sparsity-Based Pruning on Edge Computing. Electronics, 2024, 13(17).
  • [24] Hamerly, G.; Elkan, C. Learning the k in k-means. Advances in Neural Information Processing Systems 16, 2004, 16: 281-288.
  • [25] Li, Chengxi; Li, Gang; Varshney, Pramod K. Communication-Efficient Federated Learning Based on Compressed Sensing. IEEE Internet of Things Journal, 2021, 8(20): 15531-15541.
  • [26] Shi, Yuanming; Yang, Kai; Jiang, Tao; Zhang, Jun; Letaief, Khaled B. Communication-Efficient Edge AI: Algorithms and Systems. IEEE Communications Surveys and Tutorials, 2020, 22(4): 2167-2191.
  • [27] Zhou, Yanlin; Ma, Xiyao; Wu, Dapeng; Li, Xiaolin. Communication-Efficient and Attack-Resistant Federated Edge Learning With Dataset Distillation. IEEE Transactions on Cloud Computing, 2023, 11(3): 2517-2528.
  • [28] Cleland, Gary; Wu, Di; Ullah, Rehmat; Varghese, Blesson. FedComm: Understanding Communication Protocols for Edge-based Federated Learning. 2022 IEEE/ACM 15th International Conference on Utility and Cloud Computing (UCC), 2022: 71-81.
  • [29] Feng, Ziqiang; George, Shilpa; Harkes, Jan; Klatzky, Roberta L.; Satyanarayanan, Mahadev; Pillai, Padmanabhan. Eureka: Edge-Based Discovery of Training Data for Machine Learning. IEEE Internet Computing, 2019, 23(4): 35-42.
  • [30] Ruan, Mengzhe; Yan, Guangfeng; Xiao, Yuanzhang; Song, Linqi; Xu, Weitao. Adaptive Top-K in SGD for Communication-Efficient Distributed Learning. IEEE Conference on Global Communications (GLOBECOM), 2023: 5280-5285.