Gaussian Process Reinforcement Learning for Fast Opportunistic Spectrum Access

Cited by: 7
Authors
Yan, Zun [1 ]
Cheng, Peng [1 ,2 ]
Chen, Zhuo [3 ]
Li, Yonghui [1 ]
Vucetic, Branka [1 ]
Affiliations
[1] Univ Sydney, Sch Elect & Informat Engn, Sydney, NSW 2006, Australia
[2] La Trobe Univ, Dept Comp Sci & Informat Technol, Melbourne, Vic 3086, Australia
[3] CSIRO DATA61, Marsfield, NSW 2122, Australia
Funding
Australian Research Council;
Keywords
Sensors; Correlation; Kernel; Gaussian processes; Learning (artificial intelligence); Radio frequency; Training; Opportunistic spectrum access; sensing policy; Gaussian process reinforcement learning (GPRL); machine learning; COGNITIVE RADIO NETWORKS; OPTIMALITY; DESIGN; BANDIT; MAC;
DOI
10.1109/TSP.2020.2986354
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics & Communication Technology];
Discipline Code
0808 ; 0809 ;
Abstract
Opportunistic spectrum access (OSA) is envisioned to support the spectrum demand of future-generation wireless networks. The majority of existing work assumes independent primary channels and known network dynamics. In practice, however, the channels are usually correlated and the network dynamics are unknown a priori. This poses a great challenge to sensing policy design for spectrum opportunity tracking, and the conventional partially observable Markov decision process (POMDP) formulation with model-based solutions is generally inapplicable. In this paper, we take a different approach and formulate the sensing policy design as a time-series POMDP from a model-free perspective. To solve this time-series POMDP, we propose a novel Gaussian process reinforcement learning (GPRL) based solution, which achieves accurate channel selection and a fast learning rate. In essence, the GP is embedded in RL as a Q-function approximator to efficiently utilize past learning experience. A novel kernel function is first tailor-designed to measure the correlation of time-series spectrum data. A covariance-based exploration strategy is then developed to enable proactive exploration for better policy learning. Finally, for GPRL to adapt to multichannel sensing, we propose a novel action-trimming method to reduce the computational cost. Our simulation results show that the designed sensing policy outperforms existing ones and obtains near-optimal performance within a short learning phase.
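The core idea sketched in the abstract can be illustrated in code: a Gaussian process regresses Q-values over (state, action) features, and the GP's predictive covariance drives a UCB-style exploration bonus. The toy two-channel environment, the RBF kernel, and all hyperparameters below are illustrative assumptions for a minimal sketch, not the authors' tailor-designed kernel or action-trimming method.

```python
# Minimal sketch of GP-based Q-function approximation with a
# covariance-based exploration bonus. Kernel, environment, and
# hyperparameters are illustrative assumptions, not the paper's design.
import numpy as np

rng = np.random.default_rng(0)

def rbf_kernel(X1, X2, length=1.0, var=1.0):
    """Squared-exponential kernel between rows of X1 and X2."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return var * np.exp(-0.5 * d2 / length**2)

class GPQ:
    """GP regression over (state, action) features -> Q-value estimate."""
    def __init__(self, noise=0.1):
        self.noise = noise
        self.X = np.empty((0, 2))  # columns: [state feature, action index]
        self.y = np.empty(0)

    def fit(self, X, y):
        self.X, self.y = X, y
        K = rbf_kernel(X, X) + self.noise**2 * np.eye(len(X))
        self.K_inv = np.linalg.inv(K)

    def predict(self, Xs):
        if len(self.X) == 0:  # prior: zero mean, unit std
            return np.zeros(len(Xs)), np.ones(len(Xs))
        Ks = rbf_kernel(Xs, self.X)
        mu = Ks @ self.K_inv @ self.y
        # Predictive variance: prior variance minus the explained part.
        var = 1.0 - np.einsum('ij,jk,ik->i', Ks, self.K_inv, Ks)
        return mu, np.sqrt(np.clip(var, 1e-9, None))

# Toy 2-channel OSA environment: channel 1 is idle more often than channel 0.
p_free = np.array([0.3, 0.8])
def sense(action):
    return float(rng.random() < p_free[action])  # reward 1 if channel idle

gp, beta = GPQ(), 2.0
Xs, ys = [], []
state = 0.0  # last sensing outcome serves as a crude "state" feature
for t in range(200):
    cand = np.array([[state, a] for a in (0, 1)], float)
    mu, sd = gp.predict(cand)
    a = int(np.argmax(mu + beta * sd))  # covariance-based exploration
    r = sense(a)
    Xs.append([state, a]); ys.append(r)
    gp.fit(np.array(Xs), np.array(ys))
    state = r

mu, _ = gp.predict(np.array([[1.0, 0], [1.0, 1]], float))
print(mu)  # Q-estimates per channel; the freer channel should score higher
```

In a full GPRL treatment the squared-exponential kernel above would be replaced by a kernel measuring correlation of time-series spectrum data, and the candidate-action set would be pruned by action trimming; this sketch only shows where those pieces plug in.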
Pages: 2613-2628
Page count: 16