Byzantine-Resilient Decentralized Policy Evaluation With Linear Function Approximation

Cited by: 11
Authors
Wu, Zhaoxian [1, 2, 3]
Shen, Han [4]
Chen, Tianyi [4]
Ling, Qing [1, 2, 3]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou 510006, Guangdong, Peoples R China
[2] Sun Yat Sen Univ, Guangdong Prov Key Lab Computat Sci, Guangzhou 510006, Guangdong, Peoples R China
[3] Pazhou Lab, Guangzhou 510300, Guangdong, Peoples R China
[4] Rensselaer Polytech Inst, Dept Elect Comp & Syst Engn, Troy, NY 12180 USA
Funding
National Natural Science Foundation of China;
Keywords
Signal processing algorithms; Convergence; Function approximation; Optimization; Approximation algorithms; Task analysis; Mathematical model; Policy evaluation; multi-agent reinforcement learning; temporal-difference learning; Byzantine attacks; DISTRIBUTED OPTIMIZATION; CONVERGENCE; ALGORITHMS;
DOI
10.1109/TSP.2021.3090952
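Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808; 0809;
Abstract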
In this paper, we consider the policy evaluation problem in reinforcement learning with agents on a decentralized and directed network. To evaluate the quality of a fixed policy in this decentralized setting, one option is for agents to run decentralized temporal-difference (TD) learning collaboratively. To account for practical scenarios where the state and action spaces are large and malicious attacks emerge, we focus on decentralized TD learning with linear function approximation in the presence of malicious agents (often termed Byzantine agents). We propose a trimmed mean-based Byzantine-resilient decentralized TD algorithm to perform policy evaluation in this setting. We establish the finite-time convergence rate, as well as the asymptotic learning error, which depends on the number of Byzantine agents. Numerical experiments corroborate the robustness of the proposed algorithm.
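The abstract names two ingredients, trimmed-mean aggregation and TD learning with linear function approximation, without giving the update rules. The following is a minimal generic sketch of each, not the authors' algorithm: the function names (`trimmed_mean`, `aggregate`, `td0_step`), the trimming parameter `b`, and the choice of a plain TD(0) step are all assumptions made for illustration.

```python
from typing import List, Sequence


def trimmed_mean(values: Sequence[float], b: int) -> float:
    """Drop the b smallest and b largest values, then average the rest."""
    if b < 0 or len(values) <= 2 * b:
        raise ValueError("need more than 2*b values to trim b from each end")
    kept = sorted(values)[b:len(values) - b]
    return sum(kept) / len(kept)


def aggregate(received: List[List[float]], b: int) -> List[float]:
    """Coordinate-wise trimmed mean over received parameter vectors."""
    dim = len(received[0])
    return [trimmed_mean([p[k] for p in received], b) for k in range(dim)]


def td0_step(theta: List[float], phi_s: List[float], phi_next: List[float],
             reward: float, gamma: float, step_size: float) -> List[float]:
    """One TD(0) update for the linear model V(s) = phi(s)^T theta."""
    v_s = sum(t * f for t, f in zip(theta, phi_s))
    v_next = sum(t * f for t, f in zip(theta, phi_next))
    td_error = reward + gamma * v_next - v_s
    return [t + step_size * td_error * f for t, f in zip(theta, phi_s)]


# Hypothetical data: four plausible parameter vectors and one extreme outlier.
received = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [1.0, 2.0], [1000.0, -1000.0]]
robust = aggregate(received, b=1)

# A single TD(0) step from a zero parameter vector (made-up features/reward).
updated = td0_step([0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                   reward=1.0, gamma=0.9, step_size=0.5)
```

With `b=1`, the outlier vector is discarded in each coordinate, so `robust` stays close to the other four vectors; a plain average would be dominated by the outlier. How the paper combines aggregation with the TD step, and how it chooses the trimming level, is not stated in the abstract.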
Pages: 3839 - 3853
Number of pages: 15