Tuning Local Search by Average-Reward Reinforcement Learning

Cited by: 5

Authors:

Prestwich, Steven [1]

Affiliations:

[1] Natl Univ Ireland Univ Coll Cork, Dept Comp Sci, Cork Constraint Computat Ctr, Cork, Ireland

Source:

LEARNING AND INTELLIGENT OPTIMIZATION | 2008 / Vol. 5313

Keywords:

DOI:

10.1007/978-3-540-92695-5_15

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Reinforcement Learning and local search have been combined in a variety of ways, in order to learn how to solve combinatorial problems more efficiently. Most approaches optimise the total reward, where the reward at each action is the change in objective function. We argue that it is more appropriate to optimise the average reward. We use R-learning to dynamically tune noise in standard SAT local search algorithms on single instances. Experiments show that noise can be successfully automated in this way.
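As a rough illustration of the average-reward idea in the abstract, the following is a minimal sketch of a tabular R-learning update in the style of Schwartz's algorithm. It is not taken from the paper: the class name `RLearner`, the step sizes `alpha` and `beta`, the state label, and the action names `"lower"` and `"raise"` (standing in for decreasing or increasing a noise parameter) are all assumptions made for the example. This variant checks whether the chosen action was greedy before the action value is updated, which is one of several common formulations.

```python
from collections import defaultdict


class RLearner:
    """Tabular R-learning sketch (hypothetical helper, not the paper's code)."""

    def __init__(self, actions, alpha=0.1, beta=0.01):
        self.actions = list(actions)  # assumed discrete action set
        self.alpha = alpha            # step size for the action values
        self.beta = beta              # step size for the average reward
        self.q = defaultdict(float)   # relative action values, default 0.0
        self.rho = 0.0                # running estimate of the average reward

    def best_value(self, state):
        # Largest relative action value available in this state.
        return max(self.q[(state, a)] for a in self.actions)

    def update(self, state, action, reward, next_state):
        # Record whether the action was greedy before its value changes.
        was_greedy = self.q[(state, action)] == self.best_value(state)
        # The target subtracts the average reward instead of discounting.
        target = reward - self.rho + self.best_value(next_state)
        delta = target - self.q[(state, action)]
        self.q[(state, action)] += self.alpha * delta
        # The average reward is adjusted only after greedy actions.
        if was_greedy:
            self.rho += self.beta * delta


learner = RLearner(actions=["lower", "raise"])
learner.update("s", "raise", 1.0, "s")  # greedy step: rho is updated
learner.update("s", "lower", 0.0, "s")  # non-greedy step: rho is unchanged
```

In a local search setting, the reward passed to `update` would be the change in the objective function after each step, as the abstract describes; how states and noise actions are actually encoded in the paper is not specified here.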

Citation

Pages: 192-205

Page count: 14

50 records in total

[1] Robust Average-Reward Reinforcement Learning
Wang, Yue
Velasquez, Alvaro
Atia, George
Prater-Bennette, Ashley
Zou, Shaofeng
[J]. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2024, 80 : 719 - 803
[2] Average-Reward Reinforcement Learning with Trust Region Methods
Ma, Xiaoteng
Tang, Xiaohang
Xia, Li
Yang, Jun
Zhao, Qianchuan
[J]. PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 2797 - 2803
[3] Full Gradient Deep Reinforcement Learning for Average-Reward Criterion
Pagare, Tejas
Borkar, Vivek
Avrachenkov, Konstantin
[J]. LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 211, 2023, 211
[4] On-Policy Deep Reinforcement Learning for the Average-Reward Criterion
Zhang, Yiming
Ross, Keith W.
[J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
[5] An Average-Reward Reinforcement Learning Algorithm based on Schweitzer's Transformation
Li Jianjun
Ren Jiangong
Li Yanjie
[J]. PROCEEDINGS OF THE 31ST CHINESE CONTROL CONFERENCE, 2012, : 2966 - 2970
[6] Average-Reward Learning and Planning with Options
Wan, Yi
Naik, Abhishek
Sutton, Richard S.
[J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
[7] Decentralized Multi-Agent Reinforcement Learning in Average-Reward Dynamic DCOPs
Duc Thien Nguyen
Yeoh, William
Lau, Hoong Chuin
Zilberstein, Shlomo
Zhang, Chongjie
[J]. PROCEEDINGS OF THE TWENTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2014, : 1447 - 1455
[8] An average-reward reinforcement learning algorithm for computing bias-optimal policies
Mahadevan, S
[J]. PROCEEDINGS OF THE THIRTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE EIGHTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE, VOLS 1 AND 2, 1996, : 875 - 880
[9] Scaling model-based average-reward reinforcement learning for product delivery
Proper, Scott
Tadepalli, Prasad
[J]. MACHINE LEARNING: ECML 2006, PROCEEDINGS, 2006, 4212 : 735 - 742
