On the Generalization of Representations in Reinforcement Learning

Cited: 0
Authors
Le Lan, Charline [1 ]
Tu, Stephen [2 ]
Oberman, Adam [3 ]
Agarwal, Rishabh [2 ]
Bellemare, Marc [2 ]
Affiliations
[1] Univ Oxford, Oxford, England
[2] Google Brain, Mountain View, CA USA
[3] McGill Univ, Montreal, PQ, Canada
Keywords
ARRAY;
DOI
Not available
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In reinforcement learning, state representations are used to tractably deal with large problem spaces. State representations serve both to approximate the value function with few parameters and to generalize to newly encountered states. Their features may be learned implicitly (as part of a neural network) or explicitly (for example, the successor representation of Dayan (1993)). While the approximation properties of representations are reasonably well-understood, a precise characterization of how and when these representations generalize is lacking. In this work, we address this gap and provide an informative bound on the generalization error arising from a specific state representation. This bound is based on the notion of effective dimension, which measures the degree to which knowing the value at one state informs the value at other states. Our bound applies to any state representation and quantifies the natural tension between representations that generalize well and those that approximate well. We complement our theoretical results with an empirical survey of classic representation learning methods from the literature and results on the Arcade Learning Environment, and find that the generalization behaviour of learned representations is well-explained by their effective dimension.
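
The abstract does not reproduce the formal definition of effective dimension. As a minimal sketch, assuming a leverage-score formulation (the effective dimension of a feature matrix taken as the number of states times the largest diagonal entry of the orthogonal projection onto the span of the features), the quantity could be computed as below; the function name effective_dimension and this exact normalization are illustrative assumptions, not necessarily the paper's verbatim definition.

    import numpy as np

    def effective_dimension(phi: np.ndarray) -> float:
        """Leverage-score sketch of effective dimension (an assumption,
        not necessarily the paper's exact definition).

        phi: (S, d) feature matrix with one row per state.
        Returns S times the largest leverage score, i.e. the largest
        diagonal entry of the hat matrix projecting onto span(phi).
        """
        num_states = phi.shape[0]
        # phi @ pinv(phi) is the orthogonal projector onto the column
        # span of phi; the pseudo-inverse also handles rank-deficient
        # representations.
        projector = phi @ np.linalg.pinv(phi)
        leverage_scores = np.diag(projector)
        return num_states * float(leverage_scores.max())

    # Two extremes of the approximation/generalization tension:
    S = 5
    tabular = np.eye(S)           # one-hot features, one per state
    constant = np.ones((S, 1))    # a single feature shared by all states
    print(effective_dimension(tabular))   # 5.0: one value says nothing about the rest
    print(effective_dimension(constant))  # 1.0: one value determines all others

Under this sketch, a tabular (one-hot) representation attains the largest possible value S, while a single constant feature attains the smallest value 1, mirroring the tension between approximating well and generalizing well described in the abstract.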
Pages: 26
Related Papers
50 records in total
  • [1] Using Predictive Representations to Improve Generalization in Reinforcement Learning
    Rafols, Eddie J.
    Ring, Mark B.
    Sutton, Richard S.
    Tanner, Brian
    [J]. 19TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI-05), 2005: 835-840
  • [2] Ensemble successor representations for task generalization in offline-to-online reinforcement learning
    Wang, Changhong
    Yu, Xudong
    Bai, Chenjia
    Zhang, Qiaosheng
    Wang, Zhen
    [J]. SCIENCE CHINA-INFORMATION SCIENCES, 2024, 67 (07)
  • [3] Quantifying Generalization in Reinforcement Learning
    Cobbe, Karl
    Klimov, Oleg
    Hesse, Chris
    Kim, Taehoon
    Schulman, John
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019
  • [4] Learning Action Representations for Reinforcement Learning
    Chandak, Yash
    Theocharous, Georgios
    Kostas, James E.
    Jordan, Scott M.
    Thomas, Philip S.
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019
  • [5] Learning Dynamics and Generalization in Deep Reinforcement Learning
    Lyle, Clare
    Rowland, Mark
    Dabney, Will
    Kwiatkowska, Marta
    Gal, Yarin
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [6] Reinforcement Learning with Prototypical Representations
    Yarats, Denis
    Fergus, Rob
    Lazaric, Alessandro
    Pinto, Lerrel
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021
  • [7] Graph Representations for Reinforcement Learning
    Schab, Esteban
    Casanova, Carlos
    Piccoli, Fabiana
    [J]. JOURNAL OF COMPUTER SCIENCE & TECHNOLOGY, 2024, 24 (01): 29-38
  • [8] On the Generalization Gap in Reparameterizable Reinforcement Learning
    Wang, Huan
    Zheng, Stephan
    Xiong, Caiming
    Socher, Richard
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019