Communication-Efficient k-Means for Edge-Based Machine Learning

Cited by: 1
Authors
Lu, Hanlin [1]
He, Ting [1]
Wang, Shiqiang [2]
Liu, Changchang [2]
Mahdavi, Mehrdad [1]
Narayanan, Vijaykrishnan [1]
Chan, Kevin S. [3]
Pasteris, Stephen [4]
Affiliations
[1] Penn State Univ, University Pk, PA 16802 USA
[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
[3] Army Res Lab, Adelphi, MD 20783 USA
[4] UCL, London WC1E 6EA, England
Funding
US National Science Foundation
Keywords
k-Means; dimensionality reduction; coreset; random projection; quantization; edge-based machine learning; JOHNSON-LINDENSTRAUSS; CORESETS;
DOI
10.1109/TPDS.2022.3144595
CLC Number
TP301 [Theory, Methods]
Subject Classification Code
081202
Abstract
We consider the problem of computing the k-means centers for a large high-dimensional dataset in the context of edge-based machine learning, where data sources offload machine learning computation to nearby edge servers. k-Means computation is fundamental to many data analytics, and the capability of computing provably accurate k-means centers by leveraging the computation power of the edge servers, at a low communication and computation cost to the data sources, will greatly improve the performance of these analytics. We propose to let the data sources send small summaries, generated by joint dimensionality reduction (DR), cardinality reduction (CR), and quantization (QT), to support approximate k-means computation at reduced complexity and communication cost. By analyzing the complexity, the communication cost, and the approximation error of k-means algorithms based on carefully designed composition of DR/CR/QT methods, we show that: (i) it is possible to compute near-optimal k-means centers at a near-linear complexity and a constant or logarithmic communication cost, (ii) the order of applying DR and CR significantly affects the complexity and the communication cost, and (iii) combining DR/CR methods with a properly configured quantizer can further reduce the communication cost without compromising the other performance metrics. Our theoretical analysis has been validated through experiments based on real datasets.
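The DR/QT part of the pipeline described above can be sketched in simplified form: project the high-dimensional points with a Johnson-Lindenstrauss random projection, quantize the projected coordinates as the message the data source would transmit, then run plain Lloyd's k-means at the edge server on the received summary. This is an illustrative sketch only; it omits the coreset (CR) step, and the dataset, target dimension `m`, and quantization `step` below are assumed values, not choices from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy high-dimensional dataset held by a data source: k well-separated clusters.
n, d, k = 500, 100, 3
true_labels = rng.choice(k, size=n)
X = rng.normal(size=(n, d)) + 4.0 * true_labels[:, None]

# Dimensionality reduction (DR): Johnson-Lindenstrauss random projection.
m = 20                                    # target dimension (illustrative choice)
P = rng.normal(size=(d, m)) / np.sqrt(m)  # N(0, 1/m) entries preserve distances w.h.p.
Y = X @ P

# Quantization (QT): uniform scalar quantizer; Yq is what the source transmits.
step = 0.5                                # quantization step (illustrative choice)
Yq = np.round(Y / step) * step

# k-means at the edge server: plain Lloyd's iterations on the received summary.
centers = Yq[rng.choice(n, size=k, replace=False)]
for _ in range(20):
    dists = ((Yq[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)
    for j in range(k):
        members = Yq[labels == j]
        if len(members):                  # keep the old center if a cluster empties
            centers[j] = members.mean(axis=0)

cost = ((Yq - centers[labels]) ** 2).sum()
```

Scaling the projection entries by 1/sqrt(m) keeps pairwise squared distances approximately unbiased, which is why the clustering cost on the summary tracks the cost on the original data.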
Pages: 2509-2523
Page count: 15