Enhancing visual reinforcement learning with State-Action Representation

Times Cited: 0
Authors
Yan, Mengbei [1 ]
Lyu, Jiafei [1 ]
Li, Xiu [1 ]
Affiliations
[1] Tsinghua Univ, Tsinghua Shenzhen Int Grad Sch, Lishui Rd, Shenzhen 518055, Peoples R China
Keywords
Visual reinforcement learning; State-action representation; Sample efficiency
DOI
10.1016/j.knosys.2024.112487
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Despite the remarkable progress made in visual reinforcement learning (RL) in recent years, sample inefficiency remains a major challenge. Many existing approaches attempt to address this by extracting better representations from raw images, using techniques such as data augmentation or auxiliary tasks. However, these methods overlook the environment dynamics information embedded in the collected transitions, which can be crucial for efficient control. In this paper, we present STAR: State-Action Representation Learning, a simple yet effective approach for visual continuous control. STAR learns a joint state-action representation by modeling the dynamics of the environment in the latent space. By incorporating the learned joint state-action representation into the critic, STAR enhances value estimation with latent dynamics information. We theoretically show that the value function can still converge to the optimum when additional representation inputs are involved. On various challenging visual continuous control tasks from the DeepMind Control Suite, STAR achieves significant improvements in sample efficiency compared to strong baseline algorithms.
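The abstract describes the mechanism only at a high level. Below is a minimal PyTorch sketch of the general idea it outlines: a joint state-action representation trained with a latent dynamics prediction objective and then fed to the critic as an additional input. All module names (StateActionEncoder, Critic, dynamics_loss), network sizes, and loss choices here are illustrative assumptions, not the paper's actual implementation.

    import torch
    import torch.nn as nn

    class StateActionEncoder(nn.Module):
        # Hypothetical joint state-action encoder: maps a latent state z_t and an
        # action a_t to a joint representation trained to be predictive of z_{t+1}.
        def __init__(self, latent_dim, action_dim, repr_dim):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(latent_dim + action_dim, 256), nn.ReLU(),
                nn.Linear(256, repr_dim),
            )
            # Dynamics head: reconstructs the next latent state from the joint representation.
            self.dynamics_head = nn.Linear(repr_dim, latent_dim)

        def forward(self, z, a):
            return self.net(torch.cat([z, a], dim=-1))

    def dynamics_loss(encoder, z, a, z_next):
        # Latent dynamics objective: the joint representation must retain enough
        # information about (z_t, a_t) to predict the next latent state z_{t+1}.
        joint = encoder(z, a)
        return nn.functional.mse_loss(encoder.dynamics_head(joint), z_next.detach())

    class Critic(nn.Module):
        # Q-network that takes the latent state, the action, and the learned joint
        # state-action representation as an extra input, as the abstract describes.
        def __init__(self, latent_dim, action_dim, repr_dim):
            super().__init__()
            self.q = nn.Sequential(
                nn.Linear(latent_dim + action_dim + repr_dim, 256), nn.ReLU(),
                nn.Linear(256, 1),
            )

        def forward(self, z, a, joint_repr):
            return self.q(torch.cat([z, a, joint_repr], dim=-1))

    # Example shapes: batch of 32, 50-dim latent states, 6-dim actions, 64-dim representation.
    enc = StateActionEncoder(50, 6, 64)
    critic = Critic(50, 6, 64)
    z, a, z_next = torch.randn(32, 50), torch.randn(32, 6), torch.randn(32, 50)
    loss = dynamics_loss(enc, z, a, z_next)   # representation learning signal
    q_value = critic(z, a, enc(z, a))         # value estimate augmented with dynamics info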
Pages: 11
Related Papers
50 records in total
  • [21] VCSAP: Online reinforcement learning exploration method based on visitation count of state-action pairs
    Zhou, Ruikai
    Zhu, Wenbo
    Han, Shuai
    Kang, Meng
    Lu, Shuai
    NEURAL NETWORKS, 2025, 184
  • [22] THE STATE-ACTION PROBLEM
    FREUND, PA
    PROCEEDINGS OF THE AMERICAN PHILOSOPHICAL SOCIETY, 1991, 135 (01) : 3 - 12
  • [23] Reinforcement learning in dynamic environment: abstraction of state-action space utilizing properties of the robot body and environment
    Ito, Kazuyuki
    Takeuchi, Yutaka
    ARTIFICIAL LIFE AND ROBOTICS, 2016, 21 (01) : 11 - 17
  • [24] Online Reinforcement Learning Control of Nonlinear Dynamic Systems: A State-action Value Function Based Solution
    Asl, Hamed Jabbari
    Uchibe, Eiji
    NEUROCOMPUTING, 2023, 544
  • [25] R-learning with multiple state-action value tables
    Ishikawa, Koichiro
    Sakurai, Akito
    Fujinami, Tsutomu
    Kunifuji, Susumu
    IEEJ Trans. Electron. Inf. Syst., 2006, (1): 72 - 82
  • [26] R-learning with multiple state-action value tables
    Ishikawa, Koichiro
    Sakurai, Akito
    Fujinami, Tsutomu
    Kunifuji, Susumu
    ELECTRICAL ENGINEERING IN JAPAN, 2007, 159 (03) : 34 - 47
  • [27] Reinforcement learning in dynamic environment -Abstraction of state-action space utilizing properties of the robot body and environment-
    Takeuchi, Yutaka
    Ito, Kazuyuki
    PROCEEDINGS OF THE SEVENTEENTH INTERNATIONAL SYMPOSIUM ON ARTIFICIAL LIFE AND ROBOTICS (AROB 17TH '12), 2012, : 938 - 942
  • [29] How Should Learning Classifier Systems Cover A State-Action Space?
    Nakata, Masaya
    Lanzi, Pier Luca
    Kovacs, Tim
    Browne, Will Neil
    Takadama, Keiki
    2015 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2015, : 3012 - 3019
  • [30] Environment Agnostic Representation for Visual Reinforcement Learning
    Choi, Hyesong
    Lee, Hunsang
    Jeong, Seongwon
    Min, Dongbo
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 263 - 273