Two-Stage Clustering for Federated Learning with Pseudo Mini-batch SGD Training on Non-IID Data

Authors
Weng, Jianqing [1]
Su, Songzhi [1]
Fan, Xiaoliang [1]
Affiliations
[1] Xiamen Univ, Xiamen 361005, Peoples R China
Keywords
Federated learning; Clustering; Non-IID data
DOI
10.1007/978-981-19-4546-5_3
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The statistical heterogeneity problem in federated learning is mainly caused by the skewness of the data distribution among clients. In this paper, we first identify a connection between the discrepancy of clients' data distributions and the divergence of their models. Based on this insight, we introduce a K-center clustering method that groups clients by the similarity of their local update parameters, which effectively reduces data distribution skewness. Second, we provide a theoretical proof that a more uniform data distribution among the clients in training reduces the growth of model divergence, thereby improving training performance in Non-IID environments. Accordingly, we randomly divide the clients of each first-stage cluster into multiple fine-grained clusters to flatten the original data distribution. Finally, to fully leverage the data in each fine-grained cluster, we propose an intra-cluster training method named pseudo mini-batch SGD training, which performs general mini-batch SGD training on each fine-grained cluster while the data is kept local. With this two-stage clustering mechanism, the negative effect of Non-IID data can be steadily eliminated. Experiments on two federated learning benchmarks, FEMNIST and CelebA, and on a manually constructed Non-IID dataset based on CIFAR10 show that the proposed method significantly improves training efficiency on Non-IID data and outperforms several widely used federated baselines.
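The two clustering stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the distance metric, the K-center variant, or the fine-grained cluster size, so the sketch assumes Euclidean distance on flattened update vectors, the greedy farthest-point heuristic for K-center, and a literal reading of the second stage in which each first-stage cluster is shuffled and cut into fixed-size groups. The names `k_center_clusters`, `fine_grained_clusters`, and `group_size` are hypothetical.

```python
import numpy as np


def k_center_clusters(updates, k, seed=0):
    """Stage 1 (assumed variant): greedy farthest-point K-center.

    updates: array of shape (n_clients, n_params), one flattened local
    update per client. Returns an array of cluster labels in [0, k).
    """
    rng = np.random.default_rng(seed)
    n = updates.shape[0]
    centers = [int(rng.integers(n))]  # arbitrary first center
    # Distance from every client to its nearest chosen center so far.
    dist = np.linalg.norm(updates - updates[centers[0]], axis=1)
    while len(centers) < k:
        nxt = int(np.argmax(dist))  # farthest client becomes a center
        centers.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(updates - updates[nxt], axis=1))
    # Assign each client to its nearest center (Euclidean, an assumption).
    diffs = updates[:, None, :] - updates[centers][None, :, :]
    return np.argmin(np.linalg.norm(diffs, axis=2), axis=1)


def fine_grained_clusters(labels, group_size, seed=0):
    """Stage 2 (assumed reading): shuffle each first-stage cluster and
    split it into groups of at most `group_size` clients."""
    rng = np.random.default_rng(seed)
    groups = []
    for c in np.unique(labels):
        members = rng.permutation(np.flatnonzero(labels == c))
        for i in range(0, len(members), group_size):
            groups.append(members[i:i + group_size].tolist())
    return groups


if __name__ == "__main__":
    # Toy "updates": two well-separated groups of four clients each.
    toy = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
                    [10.0, 10.0], [10.1, 10.0], [10.0, 10.1], [10.1, 10.1]])
    stage1 = k_center_clusters(toy, k=2)
    print(stage1, fine_grained_clusters(stage1, group_size=2))
```

The greedy heuristic is used here only because it is short and deterministic given a seed; the intra-cluster pseudo mini-batch SGD step is left out because the abstract does not describe its mechanics.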
Pages: 29-43
Number of pages: 15