Accelerated multi-objective task learning using modified Q-learning algorithm

Cited by: 0
Authors
Rajamohan, Varun Prakash [1 ]
Jagatheesaperumal, Senthil Kumar [1 ]
Affiliations
[1] Mepco Schlenk Engn Coll, Dept Elect & Commun Engn, Sivakasi, Tamil Nadu, India
Keywords
reinforcement learning; Q-learning; robotic manipulator; task learning; distance metric;
DOI
10.1504/IJAHUC.2024.140665
Chinese Library Classification (CLC)
TP [automation technology; computer technology];
Subject classification code
0812;
Abstract
Robots find extensive applications in industry, and in recent years their influence has also grown rapidly in domestic settings. The Q-learning algorithm aims to maximise the reward for reaching the goal. This paper proposes a modified version of the Q-learning algorithm, called Q-learning with scaled distance metric (Q-SD), which enhances task learning and makes task completion more meaningful. A robotic manipulator (agent) applies the Q-SD algorithm to the task of table cleaning. Using Q-SD, the agent learns the sequence of steps needed to accomplish the task while minimising the manipulator's movement distance. The table is partitioned into grids of two dimensions: the first a 3 x 3 grid and the second a 4 x 4 grid. With the Q-SD algorithm, the maximum success rates obtained in these two environments were 86% and 59%, respectively. Moreover, compared with the conventional Q-learning algorithm, the reduction in the average distance moved by the agent in these two environments using Q-SD was 8.61% and 6.7%, respectively.
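The abstract does not give the exact Q-SD update rule, but the core idea (standard tabular Q-learning with the per-step reward penalised by a scaled movement distance) can be sketched as follows. The grid size, reward values, and the scaling factor `lam` are all assumptions for illustration, not the paper's parameters.

```python
import random

def q_sd_train(n=3, episodes=2000, alpha=0.5, gamma=0.9, eps=0.1, lam=0.1):
    """Tabular Q-learning on an n x n grid where each step's reward is
    penalised by a scaled movement distance (a sketch of the Q-SD idea;
    the exact scaling used in the paper is not stated in the abstract)."""
    goal = (n - 1, n - 1)
    actions = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up
    Q = {((r, c), a): 0.0
         for r in range(n) for c in range(n) for a in range(len(actions))}
    for _ in range(episodes):
        s = (0, 0)
        while s != goal:
            # epsilon-greedy action selection
            a = (random.randrange(4) if random.random() < eps
                 else max(range(4), key=lambda b: Q[(s, b)]))
            dr, dc = actions[a]
            # clamp to the grid: moves into a wall leave the state unchanged
            s2 = (min(max(s[0] + dr, 0), n - 1),
                  min(max(s[1] + dc, 0), n - 1))
            dist = abs(dr) + abs(dc)  # Manhattan distance of the attempted move
            # goal reward minus the scaled-distance penalty
            r = (1.0 if s2 == goal else 0.0) - lam * dist
            best_next = max(Q[(s2, b)] for b in range(4))
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return Q
```

Setting `lam = 0` recovers conventional Q-learning; a positive `lam` makes longer paths strictly less rewarding, which is one plausible way to obtain the reduced average movement distance the paper reports.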
Pages: 28-37
Page count: 10
Related papers
50 records
  • [21] Backward Q-learning: The combination of Sarsa algorithm and Q-learning
    Wang, Yin-Hao
    Li, Tzuu-Hseng S.
    Lin, Chih-Jui
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2013, 26 (09) : 2184 - 2193
  • [22] Heuristically accelerated Q-learning algorithm based on Laplacian Eigenmap
    Zhu, Mei-Qiang
    Li, Ming
    Cheng, Yu-Hu
    Zhang, Qian
    Wang, Xue-Song
    Kongzhi yu Juece/Control and Decision, 2014, 29 (03): : 425 - 430
  • [23] A Dual-Population Genetic Algorithm with Q-Learning for Multi-Objective Distributed Hybrid Flow Shop Scheduling Problem
    Zhang, Jidong
    Cai, Jingcao
    SYMMETRY-BASEL, 2023, 15 (04):
  • [24] Bidirectional Q-Learning based Multi-objective optimization Routing Protocol for Multi-Destination FANETs
    Xue, Liang
    Tang, Jie
    Zhang, Jiaying
    Hu, Juncheng
    2024 IEEE INTERNATIONAL WORKSHOP ON RADIO FREQUENCY AND ANTENNA TECHNOLOGIES, IWRF&AT 2024, 2024, : 421 - 426
  • [25] Assess team Q-learning algorithm in a purely cooperative multi-robot task
    Wang, Ying
    De Silva, Clarence W.
    PROCEEDINGS OF THE ASME INTERNATIONAL MECHANICAL ENGINEERING CONGRESS AND EXPOSITION 2007, VOL 9, PTS A-C: MECHANICAL SYSTEMS AND CONTROL, 2008, : 627 - 633
  • [26] A task distribution based Q-learning algorithm for multi-agent team coordination
    Sun, Qiao
    Transport and Telecommunication Institute, Riga, Latvia (18):
  • [27] LEARNING MULTI-OBJECTIVE DECEPTION IN A TWO-PLAYER DIFFERENTIAL GAME USING REINFORCEMENT LEARNING AND MULTI-OBJECTIVE GENETIC ALGORITHM
    Asgharnia A.
    Schwartz H.
    Atia M.
    International Journal of Innovative Computing, Information and Control, 2022, 18 (06): : 1667 - 1688
  • [28] Pruning the Way to Reliable Policies: A Multi-Objective Deep Q-Learning Approach to Critical Care
    Shirali A.
    Schubert A.
    Alaa A.
    IEEE Journal of Biomedical and Health Informatics, 2024, 28 (10) : 1 - 13
  • [29] Multimodal transportation routing optimization based on multi-objective Q-learning under time uncertainty
    Zhang, Tie
    Cheng, Jia
    Zou, Yanbiao
    COMPLEX & INTELLIGENT SYSTEMS, 2024, 10 (02) : 3133 - 3152