Flexible Data Augmentation in Off-Policy Reinforcement Learning

Cited by: 0
Authors
Rak, Alexandra [1]
Skrynnik, Alexey [1,2]
Panov, Aleksandr I. [1,2]
Affiliations
[1] Moscow Inst Phys & Technol, Moscow, Russia
[2] Russian Acad Sci, Fed Res Ctr Comp Sci & Control, Moscow, Russia
Keywords
Reinforcement learning; Image augmentation; Rainbow; Regularization
DOI
10.1007/978-3-030-87986-0_20
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper explores the application of image augmentation, a popular regularization technique from computer vision, to reinforcement learning tasks. The analysis is based on model-free off-policy algorithms. As regularization, we consider augmenting the frames sampled from the model's replay buffer. The evaluated augmentation techniques include random changes in image contrast, random shifting, random cutting, and others. The research uses Atari game environments: Breakout, Space Invaders, Berzerk, Wizard of Wor, and Demon Attack. The results confirm that using augmentations significantly accelerates the algorithm's convergence. We also propose an adaptive mechanism for selecting the augmentation type depending on the task performed by the agent.
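The abstract describes augmenting frames as they are sampled from the replay buffer, before the off-policy update. The paper's own implementation is not reproduced in this record; the following is a minimal NumPy sketch of two of the named transforms (random shifting and random contrast), with `random_shift` and `random_contrast` as illustrative function names of my choosing.

```python
import numpy as np

def random_shift(frames, pad=4, rng=None):
    """Shift each frame by edge-padding, then cropping a random window
    back to the original size. frames: (batch, height, width)."""
    rng = rng or np.random.default_rng()
    b, h, w = frames.shape
    padded = np.pad(frames, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    out = np.empty_like(frames)
    for i in range(b):
        top = rng.integers(0, 2 * pad + 1)    # random crop origin
        left = rng.integers(0, 2 * pad + 1)
        out[i] = padded[i, top:top + h, left:left + w]
    return out

def random_contrast(frames, lo=0.8, hi=1.2, rng=None):
    """Scale pixel intensities by a random per-frame factor, then clip
    back to the valid [0, 1] range."""
    rng = rng or np.random.default_rng()
    factors = rng.uniform(lo, hi, size=(frames.shape[0], 1, 1))
    return np.clip(frames * factors, 0.0, 1.0)

# A batch sampled from the replay buffer (here: random 84x84 frames)
# is augmented before being fed to the Q-learning update.
batch = np.random.default_rng(0).random((32, 84, 84))
augmented = random_contrast(random_shift(batch))
print(augmented.shape)
```

An adaptive selection mechanism, as proposed in the paper, would choose among such transforms per task rather than applying a fixed pipeline; the selection criterion itself is not specified in this abstract.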
Pages: 224-235 (12 pages)