Tuning Local Search by Average-Reward Reinforcement Learning

Cited by: 5
Authors
Prestwich, Steven [1]
Affiliations
[1] Natl Univ Ireland Univ Coll Cork, Dept Comp Sci, Cork Constraint Computat Ctr, Cork, Ireland
DOI
10.1007/978-3-540-92695-5_15
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Reinforcement Learning and local search have been combined in a variety of ways, in order to learn how to solve combinatorial problems more efficiently. Most approaches optimise the total reward, where the reward at each action is the change in objective function. We argue that it is more appropriate to optimise the average reward. We use R-learning to dynamically tune noise in standard SAT local search algorithms on single instances. Experiments show that noise can be successfully automated in this way.
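As a rough illustration of the average-reward idea in the abstract, here is a minimal sketch of one tabular R-learning update in a common textbook form. This record does not give the paper's own state, action, or reward encoding, so the function name `r_learning_step`, the dictionary `q`, and the step sizes `beta` and `alpha` are all assumptions made for illustration.

```python
def r_learning_step(q, rho, state, action, reward, next_state, actions,
                    beta=0.1, alpha=0.01):
    """Apply one tabular R-learning update in place and return the new rho.

    q       : dict mapping (state, action) -> relative action value
    rho     : current estimate of the average reward per step
    actions : iterable of all actions (hypothetically, candidate noise levels)
    """
    best_here = max(q.get((state, a), 0.0) for a in actions)
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    q_sa = q.get((state, action), 0.0)

    # Assumption: greediness is judged before the value update.
    # Published variants differ on this detail.
    was_greedy = q_sa == best_here

    # Relative value update: the reward is measured against the average rho.
    q[(state, action)] = q_sa + beta * (reward - rho + best_next - q_sa)

    # The average-reward estimate is adjusted only after greedy actions,
    # so exploratory moves do not bias it.
    if was_greedy:
        rho += alpha * (reward + best_next - best_here - rho)
    return rho
```

In a noise-tuning setting, each action could stand for a candidate noise value and the reward for the change in the objective after a local search move. That mapping is a guess from the abstract, not a description of the author's implementation.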
Pages: 192-205
Page count: 14