On the source-to-target gap of robust double deep Q-learning in digital twin-enabled wireless networks

Citations: 2
Authors
McManus, Maxwell [1 ]
Guan, Zhangyu [1 ]
Mastronarde, Nicholas [1 ]
Zou, Shaofeng [1 ]
Affiliations
[1] Univ Buffalo, Dept Elect Engn, Buffalo, NY 14260 USA
Keywords
Zero-touch Networks; Digital Twin; Reinforcement Learning; Domain Adaptation; Source-to-Target Gap; Simulation
DOI
10.1117/12.2618612
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Digital twin has been envisioned as a key tool to enable data-driven real-time monitoring and prediction, automated modeling, as well as zero-touch control and optimization in next-generation wireless networks. However, because of the mismatch between the dynamics in the source domain (i.e., the digital twin) and the target domain (i.e., the real network), policies generated in the source domain by traditional machine learning algorithms may suffer significant performance degradation when applied in the target domain, the so-called "source-to-target (S2T) gap" problem. In this work, we experimentally investigate the S2T gap in digital twin-enabled wireless networks, considering a new class of reinforcement learning algorithms referred to as robust deep reinforcement learning. We first design a robust learning framework, based on a combination of double deep Q-learning and an R-contamination model, that controls policy robustness against the adversarial dynamics expected in the target domain. We then test the robustness of the framework over UBSim, an event-driven universal simulator for broadband mobile wireless networks. The source domain is constructed in UBSim as a virtual representation of an indoor testing environment at the University at Buffalo, and the target domain is then constructed by modifying the source domain in terms of blockage distribution and user locations, among other factors. We compare the robust learning algorithm with traditional reinforcement learning algorithms in the presence of controlled model mismatch between the source and target domains. Through these experiments we demonstrate that, with proper selection of the parameter R, robust learning algorithms can significantly reduce the S2T gap, while being either too conservative or too explorative otherwise. We observe that robust policy transfer is especially effective for target domains with time-varying blockage dynamics.
Pages: 12
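
The abstract describes combining double deep Q-learning with an R-contamination model, in which the next-state transition follows the nominal (source-domain) dynamics with probability 1 - R and an adversarial kernel with probability R. Below is a minimal PyTorch sketch of what such a robust double-DQN target computation could look like; the function name, batch layout, and the use of the batch minimum as a stand-in for the worst-case next-state value are illustrative assumptions, not the authors' implementation.

```python
import torch


def robust_ddqn_target(online_net, target_net, batch, gamma=0.99, R=0.1):
    """Robust double-DQN targets under an R-contamination model (sketch).

    batch: dict of tensors 'reward' [B], 'next_state' [B, ...], 'done' [B]
    R:     contamination level; R = 0 recovers the vanilla double-DQN target
    """
    rewards = batch["reward"]
    next_states = batch["next_state"]
    dones = batch["done"].float()

    with torch.no_grad():
        # Double DQN: the online network selects the greedy next action,
        # the target network evaluates it (reduces overestimation bias).
        next_actions = online_net(next_states).argmax(dim=1, keepdim=True)
        next_values = target_net(next_states).gather(1, next_actions).squeeze(1)

        # Adversarial term: worst next-state value over the sampled batch,
        # used here as a tractable proxy for the minimum over the full
        # state space (an assumption of this sketch).
        worst_value = next_values.min()

        # R-contaminated Bellman backup: nominal dynamics with probability
        # (1 - R), adversarial dynamics with probability R.
        robust_next = (1.0 - R) * next_values + R * worst_value
        targets = rewards + gamma * (1.0 - dones) * robust_next

    return targets
```

Setting R = 0 recovers the standard double-DQN target, while larger R weights the worst-case term more heavily, consistent with the abstract's observation that the policy becomes either too conservative or too explorative when R is poorly chosen.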