Accelerating Model-Free Reinforcement Learning With Imperfect Model Knowledge in Dynamic Spectrum Access

Cited by: 14
Authors
Li, Lianjun [1 ]
Liu, Lingjia [1 ]
Bai, Jianan [1 ]
Chang, Hao-Hsuan [1 ]
Chen, Hao [2 ]
Ashdown, Jonathan D. [3 ]
Zhang, Jianzhong [2 ]
Yi, Yang [1 ]
Affiliations
[1] Virginia Tech, Elect & Comp Engn Dept, Blacksburg, VA 24061 USA
[2] Samsung Res Amer, Stand & Mobil Innovat Lab, Plano, TX 75023 USA
[3] Air Force Res Lab, Informat Directorate, Rome, NY 13441 USA
Source
IEEE INTERNET OF THINGS JOURNAL | 2020, Vol. 7, No. 8
Funding
U.S. National Science Foundation;
Keywords
Computational modeling; Learning (artificial intelligence); Sensors; Wireless communication; Acceleration; Complexity theory; Internet of Things; Dynamic spectrum access (DSA); imperfect model; reinforcement learning (RL); training acceleration; wireless communications systems; NETWORKS;
DOI
10.1109/JIOT.2020.2988268
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Current studies that apply reinforcement learning (RL) to dynamic spectrum access (DSA) problems in wireless communications systems mainly focus on model-free RL (MFRL). However, in practice, MFRL requires a large number of samples to achieve good performance, making it impractical for real-time applications such as DSA. Combining model-free and model-based RL can potentially reduce the sample complexity while achieving a similar level of performance to MFRL, as long as the learned model is accurate enough. However, in a complex environment, the learned model is never perfect. In this article, we combine model-free and model-based RL and introduce an algorithm that can work with an imperfectly learned model to accelerate MFRL. Results show that our algorithm achieves higher sample efficiency than both the standard MFRL algorithm and the Dyna algorithm (a standard algorithm integrating model-based RL and MFRL), with much lower computational complexity than Dyna. In the extreme case where the learned model is highly inaccurate, the Dyna algorithm performs even worse than the MFRL algorithm, while our algorithm still outperforms MFRL.
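For readers unfamiliar with the Dyna baseline the abstract compares against, the sketch below shows a minimal tabular Dyna-Q loop: model-free Q-learning on real transitions, plus extra "planning" updates replayed from a learned model. It is an illustration only, assuming a gym-style environment (env.reset(), and env.step() returning a 4-tuple) and hypothetical hyperparameters; it reproduces the generic Dyna scheme, not the paper's imperfect-model acceleration algorithm.

import random
from collections import defaultdict

def dyna_q(env, episodes=500, planning_steps=10,
           alpha=0.1, gamma=0.95, epsilon=0.1):
    """Tabular Dyna-Q sketch: Q-learning plus planning updates drawn
    from a learned (here: deterministic, tabular) model."""
    actions = list(range(env.action_space.n))
    Q = defaultdict(float)  # Q[(state, action)] -> estimated return
    model = {}              # model[(state, action)] -> (reward, next_state, done)

    def greedy(s):
        return max(actions, key=lambda a: Q[(s, a)])

    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # Model-free step: epsilon-greedy Q-learning on a real transition.
            a = random.choice(actions) if random.random() < epsilon else greedy(s)
            s2, r, done, _ = env.step(a)
            target = r + gamma * (0.0 if done else max(Q[(s2, b)] for b in actions))
            Q[(s, a)] += alpha * (target - Q[(s, a)])

            # Model learning: memorize the observed transition.
            model[(s, a)] = (r, s2, done)

            # Planning: replay simulated transitions from the learned model.
            # When the model is inaccurate, these updates can mislead the value
            # function -- the Dyna failure mode the abstract points out.
            for _ in range(planning_steps):
                (ps, pa), (pr, ps2, pdone) = random.choice(list(model.items()))
                ptarget = pr + gamma * (0.0 if pdone else max(Q[(ps2, b)] for b in actions))
                Q[(ps, pa)] += alpha * (ptarget - Q[(ps, pa)])
            s = s2
    return Q

Note that each planning step costs one extra Q-update, which is why the abstract can claim lower computational complexity than Dyna for an algorithm that avoids or limits such simulated updates.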
Pages: 7517-7528
Page count: 12