AntNet with Reward-Penalty Reinforcement Learning

Cited by: 21

Authors:

Lalbakhsh, Pooia [1]

Zaeri, Bahram [2]

Lalbakhsh, Ali [3]

Fesharaki, Mehdi N. [4]

Affiliations:

[1] Islamic Azad Univ, Dept Comp Engn, Borujerd Branch, Borujerd, Lorestan, Iran

[2] Islamic Azad Univ Arak Branch, Young Res Club YRC, Arak, Iran

[3] Islamic Azad Univ Sci & Res Campus, Dept Telecommun Engn, Tehran, Iran

[4] Islamic Azad Univ Sci & Res Campus, Dept Comp Engn, Tehran, Iran

Source:

2010 SECOND INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE, COMMUNICATION SYSTEMS AND NETWORKS (CICSYN) | 2010

Keywords:

Ant colony optimization; AntNet; reward-penalty reinforcement learning; swarm intelligence

DOI:

10.1109/CICSyN.2010.11

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper modifies the learning phase of the AntNet routing algorithm to improve the system's adaptability in the presence of undesirable events. Unlike most ACO algorithms, which use reward-inaction reinforcement learning, the proposed strategy applies both reward and penalty to the action probabilities. Simulation results show that adding penalty to AntNet increases exploration of other possible, and sometimes better, selections, which leads to a more adaptive strategy. The proposed algorithm also uses a self-monitoring mechanism called Occurrence-Detection to sense traffic fluctuations and decide how undesirable the current status is. Together, these two strategies yield a self-healing version of the AntNet routing algorithm that can cope with undesirable and unpredictable traffic conditions.
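The difference between reward-inaction and reward-penalty updates can be illustrated with a small sketch. The code below is not taken from the paper: the reward step follows the usual AntNet-style rule for a node's next-hop probabilities, while the penalty step is assumed to be the standard linear penalty from learning-automata theory. The function names and the parameters `r` and `b` are hypothetical, chosen for illustration only.

```python
def reward_step(probs, chosen, r):
    """Reward-inaction style update: raise the chosen next hop by r in [0, 1].

    The other entries shrink proportionally, so the vector still sums to 1.
    """
    return [p + r * (1.0 - p) if i == chosen else p - r * p
            for i, p in enumerate(probs)]


def penalty_step(probs, chosen, b):
    """Assumed linear penalty update: lower the chosen next hop by b in [0, 1].

    The removed probability mass is spread over the other entries, which
    pushes later selections toward alternatives (more exploration).
    """
    n = len(probs)
    return [(1.0 - b) * p if i == chosen else b / (n - 1) + (1.0 - b) * p
            for i, p in enumerate(probs)]


if __name__ == "__main__":
    probs = [0.5, 0.3, 0.2]                   # hypothetical next-hop probabilities
    print(reward_step(probs, chosen=0, r=0.2))    # chosen hop gains probability
    print(penalty_step(probs, chosen=0, b=0.2))   # chosen hop loses probability
```

Under a pure reward-inaction scheme only `reward_step` would ever be applied, so a next hop that has become poor loses probability only indirectly, when other hops are rewarded. A penalty step lowers it directly, which is one way to read the abstract's claim about increased exploration.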


Pages: 17-21

Page count: 5
