Enhancing Federated Learning With Server-Side Unlabeled Data by Adaptive Client and Data Selection

Cited by: 0
Authors
Xu, Yang [1 ,2 ]
Wang, Lun [1 ,2 ]
Xu, Hongli [1 ,2 ]
Liu, Jianchun [1 ,2 ]
Wang, Zhiyuan [1 ,2 ]
Huang, Liusheng [1 ,2 ]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230027, Anhui, Peoples R China
[2] Univ Sci & Technol China, Suzhou Inst Adv Res, Suzhou 215123, Jiangsu, Peoples R China
Keywords
Edge computing; federated learning; semi-supervised learning; pseudo-labeling; big data
DOI
10.1109/TMC.2023.3265010
CLC number
TP [automation technology, computer technology]
Discipline code
0812
Abstract
Federated learning (FL) has been widely applied to collaboratively train deep learning (DL) models on massive end devices (i.e., clients). Due to the limited storage capacity and high labeling cost, the data on each client may be insufficient for model training. Conversely, in cloud datacenters, there exist large-scale unlabeled data, which are easy to collect from public access (e.g., social media). Herein, we propose the Ada-FedSemi system, which leverages both on-device labeled data and in-cloud unlabeled data to boost the performance of DL models. In each round, local models are aggregated to produce pseudo-labels for the unlabeled data, which are utilized to enhance the global model. Considering that the number of participating clients and the quality of pseudo-labels will have a significant impact on the training performance, we introduce a multi-armed bandit (MAB) based online algorithm to adaptively determine the participating fraction and confidence threshold. Besides, to alleviate the impact of stragglers, we assign local models of different depths to heterogeneous clients. Extensive experiments on benchmark models and datasets show that, given the same resource budget, the model trained by Ada-FedSemi achieves 3% to 14.8% higher test accuracy than that of the baseline methods. When achieving the same test accuracy, Ada-FedSemi saves up to 48% training cost compared with the baselines. Under the scenario with heterogeneous clients, the proposed HeteroAda-FedSemi can further speed up the training process by 1.3x to 1.5x.
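The abstract describes two server-side ingredients: confidence-thresholded pseudo-labeling of in-cloud unlabeled data, and a multi-armed bandit that adaptively picks the participating fraction and confidence threshold each round. The following minimal Python/NumPy sketch illustrates that loop under stated assumptions; the arm grid, the UCB1 rule, and the helper names (fedavg, pseudo_label) are illustrative placeholders, not the authors' implementation.

```python
import numpy as np

# Illustrative arm grid: each arm couples a client participation fraction with a
# pseudo-label confidence threshold. The specific values are assumptions.
ARMS = [(frac, thr) for frac in (0.1, 0.2, 0.5) for thr in (0.80, 0.90, 0.95)]


def fedavg(client_params, client_sizes):
    """Standard FedAvg: size-weighted average of client model parameters."""
    total = float(sum(client_sizes))
    return sum(p * (n / total) for p, n in zip(client_params, client_sizes))


def pseudo_label(logits, threshold):
    """Keep unlabeled samples whose max softmax confidence exceeds the
    threshold; return their indices and hard pseudo-labels."""
    z = logits - logits.max(axis=1, keepdims=True)   # numerically stable softmax
    probs = np.exp(z)
    probs /= probs.sum(axis=1, keepdims=True)
    confidence = probs.max(axis=1)
    keep = confidence >= threshold
    return np.flatnonzero(keep), probs[keep].argmax(axis=1)


class UCB1:
    """Plain UCB1 over the arm grid; the paper's actual MAB formulation may differ."""

    def __init__(self, n_arms):
        self.counts = np.zeros(n_arms, dtype=int)
        self.values = np.zeros(n_arms)
        self.t = 0

    def select(self):
        self.t += 1
        if (self.counts == 0).any():           # play every arm once first
            return int(np.argmin(self.counts))
        bonus = np.sqrt(2.0 * np.log(self.t) / self.counts)
        return int(np.argmax(self.values + bonus))

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

In each round one would select an arm to obtain (fraction, threshold), sample that fraction of clients, aggregate their updates with fedavg, pseudo-label the in-cloud data above the chosen threshold for an additional server-side update, and feed a reward (e.g., validation-accuracy gain per unit training cost, an assumed proxy) back via update.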
Pages: 2813-2831
Page count: 19
Related papers (50 in total)
  • [1] Overcoming Client Data Deficiency in Federated Learning by Exploiting Unlabeled Data on the Server
    Park, Jae-Min
    Jang, Won-Jun
    Oh, Tae-Hyun
    Lee, Si-Hyeon
    [J]. IEEE ACCESS, 2024, 12 : 130007 - 130021
  • [2] Personalized Federated Learning With Server-Side Information
    Song, Jaehun
    Oh, Min-Hwan
    Kim, Hyung-Sin
    [J]. IEEE ACCESS, 2022, 10 : 120245 - 120255
  • [3] Client-side versus server-side geographic data processing performance comparison: Data and code
    Kulawiak, Marcin
    [J]. DATA IN BRIEF, 2019, 26
  • [4] Enhancing Federated Learning with In-Cloud Unlabeled Data
    Wang, Lun
    Xu, Yang
    Xu, Hongli
    Liu, Jianchun
    Wang, Zhiyuan
    Huang, Liusheng
    [J]. 2022 IEEE 38TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2022), 2022, : 136 - 149
  • [5] On the validity of client-side vs server-side web log data analysis
    Yun, Gi Woong
    Ford, Jay
    Hawkins, Robert P.
    Pingree, Suzanne
    McTavish, Fiona
    Gustafson, David
    Berhe, Haile
    [J]. INTERNET RESEARCH, 2006, 16 (05) : 537 - 552
  • [6] Server-side parallel data reduction and analysis
    Wang, Daniel L.
    Zender, Charles S.
    Jenks, Stephen F.
    [J]. ADVANCES IN GRID AND PERVASIVE COMPUTING, PROCEEDINGS, 2007, 4459 : 744+
  • [7] Federated Learning with Positive and Unlabeled Data
    Lin, Xinyang
    Chen, Hanting
    Xu, Yixing
    Xu, Chao
    Gui, Xiaolin
    Deng, Yiping
    Wang, Yunhe
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [8] One-shot Federated Learning without server-side training
    Su, Shangchao
    Li, Bin
    Xue, Xiangyang
    [J]. NEURAL NETWORKS, 2023, 164 : 203 - 215
  • [9] An EMD-Based Adaptive Client Selection Algorithm for Federated Learning in Heterogeneous Data Scenarios
    Chen, Aiguo
    Fu, Yang
    Sha, Zexin
    Lu, Guoming
    [J]. FRONTIERS IN PLANT SCIENCE, 2022, 13
  • [10] Byzantine-Robust Federated Learning via Server-Side Mixture of Experts
    Li, Jing
    [J]. LECTURE NOTES IN ARTIFICIAL INTELLIGENCE, 14326 LNAI, Springer