Flexible Data Augmentation in Off-Policy Reinforcement Learning

Cited by: 0
Authors
Rak, Alexandra [1]
Skrynnik, Alexey [1,2]
Panov, Aleksandr I. [1,2]
Affiliations
[1] Moscow Inst Phys & Technol, Moscow, Russia
[2] Russian Acad Sci, Fed Res Ctr Comp Sci & Control, Moscow, Russia
Keywords
Reinforcement learning; Image augmentation; Rainbow; Regularization
DOI
10.1007/978-3-030-87986-0_20
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper explores the application of image augmentation, a popular regularization technique from computer vision, to reinforcement learning tasks. The analysis is based on model-free off-policy algorithms. As regularization, we consider augmenting the frames sampled from the model's replay buffer. The evaluated augmentation techniques include random changes in image contrast, random shifting, random cutting, and others. The research uses Atari game environments: Breakout, Space Invaders, Berzerk, Wizard of Wor, and Demon Attack. The results confirm that using augmentations significantly accelerates the algorithm's convergence. We also propose an adaptive mechanism for selecting the augmentation type depending on the task performed by the agent.
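The abstract describes augmenting frames as they are sampled from the replay buffer, before the off-policy update. The paper's own implementation is not reproduced in this record; the following is a minimal NumPy sketch of two of the named transforms (random shifting and random contrast), with `random_shift` and `random_contrast` as illustrative function names of my choosing.

```python
import numpy as np

def random_shift(frames, pad=4, rng=None):
    """Shift each frame by edge-padding, then cropping a random window
    back to the original size. frames: (batch, height, width)."""
    rng = rng or np.random.default_rng()
    b, h, w = frames.shape
    padded = np.pad(frames, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    out = np.empty_like(frames)
    for i in range(b):
        top = rng.integers(0, 2 * pad + 1)    # random crop origin
        left = rng.integers(0, 2 * pad + 1)
        out[i] = padded[i, top:top + h, left:left + w]
    return out

def random_contrast(frames, lo=0.8, hi=1.2, rng=None):
    """Scale pixel intensities by a random per-frame factor, then clip
    back to the valid [0, 1] range."""
    rng = rng or np.random.default_rng()
    factors = rng.uniform(lo, hi, size=(frames.shape[0], 1, 1))
    return np.clip(frames * factors, 0.0, 1.0)

# A batch sampled from the replay buffer (here: random 84x84 frames)
# is augmented before being fed to the Q-learning update.
batch = np.random.default_rng(0).random((32, 84, 84))
augmented = random_contrast(random_shift(batch))
print(augmented.shape)
```

An adaptive selection mechanism, as proposed in the paper, would choose among such transforms per task rather than applying a fixed pipeline; the selection criterion itself is not specified in this abstract.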
Pages: 224-235 (12 pages)