ON INFORMATION ASYMMETRY IN ONLINE REINFORCEMENT LEARNING

Cited by: 0
Authors
Tampubolon, Ezra [1]
Ceribasi, Haris [1]
Boche, Holger [1, 2]
Affiliations
[1] Tech Univ Munich, Lehrstuhl Theoret Informat Tech, Munich, Germany
[2] Munich Ctr Quantum Sci & Technol MCQST, Munich, Germany
Source
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021
Keywords
Information Asymmetry; Q-learning; Markov Game; Reinforcement Learning; Resource Allocation; Security
DOI
10.1109/ICASSP39728.2021.9413968
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this work, we study a system of two interacting non-cooperative Q-learning agents, where one agent has the privilege of observing the other's actions. We show that this information asymmetry can lead to a stable outcome of population learning, which does not generally occur in an environment of independent learners. Furthermore, we discuss the resulting post-learning policies, show that they are almost optimal in the sense of the underlying game, and provide numerical indications that the resulting policies are almost welfare-optimal.
Pages: 4955-4959
Number of pages: 5