Byzantine-Resilient Decentralized Policy Evaluation With Linear Function Approximation

Cited by: 11

Authors:

Wu, Zhaoxian [1,2,3]

Shen, Han [4]

Chen, Tianyi [4]

Ling, Qing [1,2,3]

Affiliations:

[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou 510006, Guangdong, Peoples R China

[2] Sun Yat Sen Univ, Guangdong Prov Key Lab Computat Sci, Guangzhou 510006, Guangdong, Peoples R China

[3] Pazhou Lab, Guangzhou 510300, Guangdong, Peoples R China

[4] Rensselaer Polytech Inst, Dept Elect Comp & Syst Engn, Troy, NY 12180 USA

Source:

IEEE TRANSACTIONS ON SIGNAL PROCESSING | 2021 / Vol. 69

Funding:

National Natural Science Foundation of China;

Keywords:

Signal processing algorithms; Convergence; Function approximation; Optimization; Approximation algorithms; Task analysis; Mathematical model; Policy evaluation; multi-agent reinforcement learning; temporal-difference learning; Byzantine attacks; DISTRIBUTED OPTIMIZATION; CONVERGENCE; ALGORITHMS;

DOI:

10.1109/TSP.2021.3090952

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

In this paper, we consider the policy evaluation problem in reinforcement learning with agents on a decentralized and directed network. To evaluate the quality of a fixed policy in this decentralized setting, one option is for agents to run decentralized temporal-difference (TD) learning collaboratively. To account for practical scenarios where the state and action spaces are large and malicious attacks emerge, we focus on decentralized TD learning with linear function approximation in the presence of malicious agents (often termed Byzantine agents). We propose a trimmed mean-based Byzantine-resilient decentralized TD algorithm to perform policy evaluation in this setting. We establish the finite-time convergence rate, as well as the asymptotic learning error, which depends on the number of Byzantine agents. Numerical experiments corroborate the robustness of the proposed algorithm.
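The two ingredients named in the abstract, trimmed-mean aggregation and TD learning with linear function approximation, can be illustrated with a minimal Python sketch. This record does not give the paper's exact update rule, so the code below is only an assumption-laden illustration: the function names `trimmed_mean` and `td0_step`, the trimming parameter `b`, and the use of plain TD(0) are all hypothetical choices, not the authors' algorithm.

```python
import numpy as np

def trimmed_mean(vectors, b):
    """Coordinate-wise trimmed mean (illustrative, not the paper's exact rule).

    For each coordinate, discard the b largest and b smallest values
    among the received vectors, then average what remains.
    """
    x = np.sort(np.asarray(vectors, dtype=float), axis=0)
    n = x.shape[0]
    if 2 * b >= n:
        raise ValueError("need more than 2*b vectors to trim b from each side")
    return x[b:n - b].mean(axis=0)

def td0_step(theta, phi_s, phi_next, reward, gamma, step_size):
    """One TD(0) update for the linear value estimate V(s) = phi(s) @ theta."""
    td_error = reward + gamma * (phi_next @ theta) - (phi_s @ theta)
    return theta + step_size * td_error * phi_s

# Four honest one-dimensional parameters and one outlier sent by a
# hypothetical Byzantine neighbor.
received = [[1.0], [1.2], [0.8], [1.1], [1000.0]]
robust = trimmed_mean(received, b=1)   # averages 1.0, 1.1 and 1.2
naive = np.mean(received, axis=0)      # dominated by the outlier

theta = td0_step(np.zeros(2), np.array([1.0, 0.0]), np.array([0.0, 1.0]),
                 reward=1.0, gamma=0.9, step_size=0.5)
```

In this toy example the plain mean is pulled far from the honest values by the single outlier, while the trimmed mean stays within their range, which is the intuition behind using it as a Byzantine-resilient aggregation step.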


Pages: 3839-3853

Number of pages: 15

50 items in total

[21] BYZANTINE-RESILIENT DISTRIBUTED COMPUTING SYSTEMS
PATNAIK, LM
BALAJI, S
SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES, 1987, 11 : 81 - 91
[22] Byzantine-Resilient Secure Federated Learning
So, Jinhyun
Guler, Basak
Avestimehr, A. Salman
IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2021, 39 (07) : 2168 - 2181
[23] Byzantine-resilient distributed observers for LTI systems
Mitra, Aritra
Sundaram, Shreyas
AUTOMATICA, 2019, 108
[24] BYZANTINE-RESILIENT DISTRIBUTED COMPUTING SYSTEMS.
Patnaik, L.M.
Balaji, S.
Sadhana - Academy Proceedings in Engineering Sciences, 1987, 11 (1-2) : 81 - 91
[25] Byzantine-Resilient Convergence in Oblivious Robot Networks
Bouzid, Zohir
Potop-Butucaru, Maria Gradinariu
Tixeuil, Sebastien
DISTRIBUTED COMPUTING AND NETWORKING, 2009, 5408 : 275 - 280
[26] Data Encoding for Byzantine-Resilient Distributed Optimization
Data, Deepesh
Song, Linqi
Diggavi, Suhas N.
IEEE TRANSACTIONS ON INFORMATION THEORY, 2021, 67 (02) : 1117 - 1140
[27] Byzantine-resilient distributed learning under constraints
Ding, Dongsheng
Wei, Xiaohan
Yu, Hao
Jovanovic, Mihailo R.
2021 AMERICAN CONTROL CONFERENCE (ACC), 2021, : 2260 - 2265
[28] SIoTFog: Byzantine-resilient IoT fog networking
Jian-wen Xu
Kaoru Ota
Mian-xiong Dong
An-feng Liu
Qiang Li
Frontiers of Information Technology & Electronic Engineering, 2018, 19 : 1546 - 1557
[29] BYRDIE: A BYZANTINE-RESILIENT DISTRIBUTED LEARNING ALGORITHM
Yang, Zhixiong
Bajwa, Waheed U.
2018 IEEE DATA SCIENCE WORKSHOP (DSW), 2018, : 21 - 25
[30] Asynchronous Byzantine-Resilient Distributed Optimization with Momentum
Wan, Yi
Qu, Yifei
Zhao, Zuyan
Yang, Shaofu
2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 2022 - 2027
