Multi-Armed Bandits Learning for Optimal Decentralized Control of Electric Vehicle Charging

Cited by: 0
Authors
Zafar, Sharyal [1]
Feraud, Raphael [2]
Blavette, Anne [3]
Camilleri, Guy [4]
Ben Ahmed, Hamid [1]
Affiliations
[1] Ecole Normale Super Rennes, SATIE Lab, Bruz, France
[2] Orange Labs, Orange, Lannion, France
[3] Ecole Normale Super Rennes, CNRS, SATIE Lab, Bruz, France
[4] Paul Sabatier Univ, IRIT Lab, Toulouse, France
Keywords
Electric vehicles; Active distribution networks; Smart charging; Multi-agent reinforcement learning; Combinatorial multi-armed bandits
DOI
10.1109/POWERTECH55446.2023.10202971
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Optimal control of new grid elements, such as electric vehicles, can ensure efficient and stable operation of distribution networks. Decentralized control can offer scalability, higher reliability, and privacy, properties that centralized or hierarchical control solutions may lack. This paper presents a decentralized multi-agent combinatorial multi-armed bandit system using Thompson Sampling for smart charging of electric vehicles. The proposed system applies bandit-based reinforcement learning to manage the uncertainty in other players' choice of actions and in intermittent photovoltaic energy production. The proposed solution is fully decentralized, real-time, scalable, model-free, and fair. Its performance is evaluated by comparison with other charging strategies, namely basic charging and centralized optimization.
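As a rough illustration of the Thompson Sampling idea named in the abstract, the following minimal sketch shows a single-agent Beta-Bernoulli bandit. It is not the paper's combinatorial multi-agent method: the class name `ThompsonSamplingAgent`, the `simulate` helper, the binary reward, and the interpretation of arms as candidate charging power levels are all assumptions made here for illustration only.

```python
import random


class ThompsonSamplingAgent:
    """Beta-Bernoulli Thompson Sampling over a finite set of arms.

    Hypothetical illustration: an arm could stand for one candidate
    charging power level, and a reward of 1 for an acceptable outcome.
    """

    def __init__(self, n_arms, seed=None):
        self.rng = random.Random(seed)
        # Beta(1, 1) prior (uniform) on each arm's success probability.
        self.alpha = [1.0] * n_arms
        self.beta = [1.0] * n_arms

    def select_arm(self):
        # Draw one posterior sample per arm and play the arm with the best draw.
        samples = [self.rng.betavariate(a, b)
                   for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        # reward must be 1 (success) or 0 (failure).
        self.alpha[arm] += reward
        self.beta[arm] += 1 - reward


def simulate(true_probs, n_rounds=2000, seed=0):
    """Run the agent against fixed (assumed) Bernoulli arms; return pull counts."""
    agent = ThompsonSamplingAgent(len(true_probs), seed=seed)
    env_rng = random.Random(seed + 1)
    pulls = [0] * len(true_probs)
    for _ in range(n_rounds):
        arm = agent.select_arm()
        reward = 1 if env_rng.random() < true_probs[arm] else 0
        agent.update(arm, reward)
        pulls[arm] += 1
    return pulls
```

Calling `simulate([0.2, 0.5, 0.8])` returns how often each arm was played; over time the agent should concentrate its pulls on the arm with the highest success probability while still occasionally exploring the others.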
Pages: 6