A perspective on off-policy evaluation in reinforcement learning

Cited by: 0
Author
Lihong Li
Affiliation
[1] Google Brain
DOI: not available
Pages: 911 - 912
Page count: 1
Related papers
50 in total
  • [1] A perspective on off-policy evaluation in reinforcement learning
    Li, Lihong
    [J]. FRONTIERS OF COMPUTER SCIENCE, 2019, 13 (05) : 911 - 912
  • [2] Reliable Off-Policy Evaluation for Reinforcement Learning
    Wang, Jie
    Gao, Rui
    Zha, Hongyuan
    [J]. OPERATIONS RESEARCH, 2024, 72 (02) : 699 - 716
  • [3] Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective
    Zhang, Zeyu
    Su, Yi
    Yuan, Hui
    Wu, Yiran
    Balasubramanian, Rishab
    Wu, Qingyun
    Wang, Huazheng
    Wang, Mengdi
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [4] Research on Off-Policy Evaluation in Reinforcement Learning: A Survey
    Wang, Shuo-Ru
    Niu, Wen-Jia
    Tong, En-Dong
    Chen, Tong
    Li, He
    Tian, Yun-Zhe
    Liu, Ji-Qiang
    Han, Zhen
    Li, Yi-Dong
    [J]. Jisuanji Xuebao/Chinese Journal of Computers, 2022, 45 (09) : 1926 - 1945
  • [5] Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning
    Thomas, Philip S.
    Brunskill, Emma
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [6] Asymptotically Efficient Off-Policy Evaluation for Tabular Reinforcement Learning
    Yin, Ming
    Wang, Yu-Xiang
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108
  • [7] Off-policy evaluation for tabular reinforcement learning with synthetic trajectories
    Weiwei Wang
    Yuqiang Li
    Xianyi Wu
    [J]. Statistics and Computing, 2024, 34
  • [8] Double Reinforcement Learning for Efficient and Robust Off-Policy Evaluation
    Kallus, Nathan
    Uehara, Masatoshi
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [9] Off-policy evaluation for tabular reinforcement learning with synthetic trajectories
    Wang, Weiwei
    Li, Yuqiang
    Wu, Xianyi
    [J]. STATISTICS AND COMPUTING, 2024, 34 (01)
  • [10] Doubly Robust Off-policy Value Evaluation for Reinforcement Learning
    Jiang, Nan
    Li, Lihong
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48