Two-Stage Clustering for Federated Learning with Pseudo Mini-batch SGD Training on Non-IID Data

Cited by: 0
Authors
Weng, Jianqing [1]
Su, Songzhi [1]
Fan, Xiaoliang [1]
Affiliation
[1] Xiamen Univ, Xiamen 361005, Peoples R China
Keywords
Federated learning; Clustering; Non-IID data
DOI
10.1007/978-981-19-4546-5_3
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The statistical heterogeneity problem in federated learning is mainly caused by the skewness of the data distribution among clients. In this paper, we first identify a connection between the discrepancy of clients' data distributions and the divergence of their models. Based on this insight, we introduce a K-center clustering method that groups clients by the similarity of their local update parameters, which effectively reduces data distribution skewness. Second, we provide a theoretical proof that a more uniform data distribution among the clients participating in training reduces the growth of model divergence, thereby improving training performance in Non-IID environments. We therefore randomly divide the clients of each first-stage cluster into multiple fine-grained clusters to flatten the original data distribution. Finally, to fully leverage the data in each fine-grained cluster, we propose an intra-cluster training method named pseudo mini-batch SGD training, which performs general mini-batch SGD training on each fine-grained cluster while the data stays local. With this two-stage clustering mechanism, the negative effect of Non-IID data can be steadily reduced. Experiments on two federated learning benchmarks, FEMNIST and CelebA, as well as on a manually constructed Non-IID dataset based on CIFAR10, show that the proposed method significantly improves training efficiency on Non-IID data and outperforms several widely used federated baselines.
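The two-stage grouping described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each client's local update is available as a flat list of floats, uses a greedy farthest-first heuristic as a stand-in for K-center clustering, and uses hypothetical names (`k_center`, `two_stage_groups`, `group_size`). The pseudo mini-batch SGD training step is not covered.

```python
import math
import random


def k_center(points, k, seed=0):
    """Greedy farthest-first k-center heuristic (an assumed stand-in).

    Returns one cluster index in [0, k) per point.
    """
    rng = random.Random(seed)
    centers = [rng.randrange(len(points))]
    # Distance from every point to its nearest chosen center so far.
    dist = [math.dist(p, points[centers[0]]) for p in points]
    while len(centers) < k:
        nxt = max(range(len(points)), key=dist.__getitem__)
        centers.append(nxt)
        dist = [min(d, math.dist(p, points[nxt])) for d, p in zip(dist, points)]
    # Assign each point to its nearest center.
    return [
        min(range(k), key=lambda c: math.dist(p, points[centers[c]]))
        for p in points
    ]


def two_stage_groups(updates, k, group_size, seed=0):
    """Stage 1: cluster clients by update similarity.
    Stage 2: randomly split each cluster into fine-grained groups.

    `updates` is a list of flattened client update vectors (assumption).
    Returns a list of groups, each a list of client indices.
    """
    rng = random.Random(seed)
    labels = k_center(updates, k, seed)
    groups = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        rng.shuffle(members)  # random division within the cluster
        for j in range(0, len(members), group_size):
            groups.append(members[j:j + group_size])
    return groups


if __name__ == "__main__":
    # Six toy clients whose updates fall into two well-separated blobs.
    toy_updates = [[0, 0], [0.1, 0], [0, 0.1], [10, 10], [10.1, 10], [10, 10.1]]
    for group in two_stage_groups(toy_updates, k=2, group_size=2):
        print(group)
```

In this sketch every fine-grained group contains only clients from the same first-stage cluster, and `group_size` caps how many clients train together.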
Pages: 29-43
Number of pages: 15