In-Network Computation for Large-Scale Federated Learning Over Wireless Edge Networks

Cited by: 11
Authors
Dinh, Thinh Quang [1 ]
Nguyen, Diep N. [1 ]
Hoang, Dinh Thai [1 ]
Pham, Tran Vu [2 ]
Dutkiewicz, Eryk [1 ]
Affiliations
[1] Univ Technol Sydney, Sch Elect & Data Engn, Ultimo, NSW 2007, Australia
[2] Ho Chi Minh City Univ Technol HCMUT, VNU HCM, Ho Chi Minh City 70000, Vietnam
Funding
Australian Research Council;
Keywords
Computational modeling; Servers; Routing; Training; Network architecture; Machine learning; Stars; Mobile edge computing; federated learning; in-network computation; large-scale distributed learning;
DOI
10.1109/TMC.2022.3190260
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812 ;
Abstract
Most conventional Federated Learning (FL) models use a star network topology in which all users aggregate their local models at a single server (e.g., a cloud server). This causes significant communication and computing overhead at the server, delaying the training process, especially for large-scale FL systems with straggling nodes. This article proposes a novel edge network architecture that decentralizes the model aggregation process away from the server, thereby significantly reducing the training delay for the whole FL network. Specifically, we design a highly effective in-network computation (INC) framework consisting of a user scheduling mechanism, an in-network aggregation (INA) process designed for both primal and primal-dual methods in distributed machine learning, and a network routing algorithm with theoretical performance bounds. The in-network aggregation process, implemented at the edge nodes and the cloud node, can adapt both methods so that edge networks can effectively solve distributed machine learning problems. Under the proposed INA, we then formulate a joint routing and resource optimization problem aiming to minimize the aggregation latency. The problem turns out to be NP-hard, so we propose a polynomial-time routing algorithm that achieves near-optimal performance with a theoretical bound. Simulation results show that the proposed algorithm achieves more than 99% of the optimal solution and reduces FL training latency by up to 5.6 times compared with other baselines. The proposed INC framework not only reduces FL training latency but also significantly decreases the cloud's traffic and computing overhead. By embedding the computing/aggregation tasks at the edge nodes and leveraging the multi-layer edge-network architecture, the INC framework liberates FL from the star topology and enables large-scale FL.
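The core idea summarized above, replacing star-topology aggregation with partial aggregation at edge nodes before the cloud merges the results, can be sketched as follows. This is an illustrative sketch only, not the authors' INC implementation: the two-layer topology, function names, and sample-count weighting are assumptions made for the example.

```python
# Illustrative sketch (assumed structure, not the paper's implementation):
# hierarchical model aggregation where each edge node partially aggregates
# its users' local models, and the cloud then merges the edge aggregates.

def weighted_average(models, weights):
    """Weighted average of model parameter vectors (lists of floats)."""
    total = sum(weights)
    dim = len(models[0])
    return [sum(w * m[i] for m, w in zip(models, weights)) / total
            for i in range(dim)]

def in_network_aggregate(edge_groups):
    """edge_groups: list of (local_models, sample_counts), one per edge node.

    Each edge node aggregates its own users' models locally; the cloud
    then merges the edge-level aggregates, weighted by the total number
    of samples behind each edge node. The result is identical to a
    direct star-topology aggregation, but the cloud processes one
    message per edge node rather than one per user.
    """
    edge_models, edge_weights = [], []
    for models, counts in edge_groups:
        edge_models.append(weighted_average(models, counts))
        edge_weights.append(sum(counts))
    return weighted_average(edge_models, edge_weights)
```

Because weighted averaging is associative in this sense, the hierarchical result matches what the single server would have computed, which is why the aggregation can be pushed into the network without changing the learned model.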
Pages: 5918-5932
Page count: 15
Related Papers
50 records
  • [1] Enabling Large-Scale Federated Learning over Wireless Edge Networks
    Thinh Quang Dinh
    Nguyen, Diep N.
    Dinh Thai Hoang
    Pham Tran Vu
    Dutkiewicz, Eryk
    2021 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2021,
  • [2] Federated Learning Over Multihop Wireless Networks With In-Network Aggregation
    Chen, Xianhao
    Zhu, Guangyu
    Deng, Yiqin
    Fang, Yuguang
    IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2022, 21 (06) : 4622 - 4634
  • [3] In-Network Caching and Learning Optimization for Federated Learning in Mobile Edge Networks
    Saputra, Yuris Mulya
    Nguyen, Diep N.
    Dinh Thai Hoang
    Dutkiewicz, Eryk
    IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC 2022), 2022, : 1653 - 1658
  • [4] Decentralized Federated Learning on the Edge Over Wireless Mesh Networks
    Salama, Abdelaziz
    Stergioulis, Achilleas
    Zaidi, Syed Ali Raza
    McLernon, Des
    IEEE ACCESS, 2023, 11 : 124709 - 124724
  • [5] Opportunistic In-Network Computation for Wireless Sensor Networks
    Jeon, Sang-Woon
    Jung, Bang Chul
    2015 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2015, : 1856 - 1860
  • [6] Broadband Digital Over-the-Air Computation for Wireless Federated Edge Learning
    You, Lizhao
    Zhao, Xinbo
    Cao, Rui
    Shao, Yulin
    Fu, Liqun
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2024, 23 (05) : 5212 - 5228
  • [7] Multidimensional Similarity In-network Query for Large-scale Sensor Networks
    Liu, Xuejun
    Zhou, Shuigeng
    Bai, Guangwei
    Zhu, Diwen
    MDM: 2009 10TH INTERNATIONAL CONFERENCE ON MOBILE DATA MANAGEMENT, 2009, : 305 - +
  • [8] Random Aggregate Beamforming for Over-the-Air Federated Learning in Large-Scale Networks
    Xu, Chunmei
    Zhang, Cheng
    Huang, Yongming
    Niyato, Dusit
    IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (21): : 34325 - 34336
  • [9] Resource Management and Fairness for Federated Learning over Wireless Edge Networks
    Balakrishnan, Ravikumar
    Akdeniz, Mustafa
    Dhakal, Sagar
    Himayat, Nageen
    PROCEEDINGS OF THE 21ST IEEE INTERNATIONAL WORKSHOP ON SIGNAL PROCESSING ADVANCES IN WIRELESS COMMUNICATIONS (IEEE SPAWC2020), 2020,
  • [10] Over-the-Air Computation of Large-Scale Nomographic Functions in MapReduce over the Edge Cloud Network
    Han, Fei
    Lau, Vincent K. N.
    Gong, Yi
    IEEE INTERNET OF THINGS JOURNAL, 2022, 9 (14): : 11843 - 11857