Interactive Reinforcement Learning Strategy

Cited by: 1

Authors:

Shi, Zhenjie [1]

Ma, Wenming [1]

Yin, Shuai [1]

Zhang, Hailiang [1]

Zhao, Xiaofan [1]

Affiliation:

[1] Yantai University, School of Computer and Control Engineering, Yantai, People's Republic of China

Source:

2021 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING & COMMUNICATIONS, INTERNET OF PEOPLE, AND SMART CITY INNOVATIONS (SMARTWORLD/SCALCOM/UIC/ATC/IOP/SCI 2021) | 2021

Keywords:

Reinforcement learning; interactive learning; path planning; Q-learning

DOI:

10.1109/SWC50871.2021.00075

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

The success of AlphaGo set off a new wave of interest in reinforcement learning, which has become one of the most popular directions in artificial intelligence. In essence, reinforcement learning integrates and extends various machine learning methods: an agent learns through continual trial and error, accumulating rewards as it goes. Q-learning is among the most commonly used reinforcement learning methods, but it suffers from sparse information early in training, long learning time, low learning efficiency, and repeated trial and error. As a result, Q-learning cannot be applied directly to real environments. To address this, the paper proposes an interactive learning method that combines voice commands with Q-learning. Early in learning, the agent uses spoken interaction with a human to locate a broad target region, and then narrows the search range step by step. This guidance helps the agent reach an effective policy quickly and reduces blind exploration. Simulation experiments show that, compared with the standard Q-learning algorithm, the proposed algorithm converges faster, shortens learning time, and reduces the number of collisions, enabling the agent to quickly find a better collision-free path.
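The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It runs tabular Q-learning on a small grid and, during an initial guided phase, restricts random exploration to actions that head toward the goal. The function hinted_actions is a hypothetical stand-in for a coarse spoken hint (no speech recognition is involved), and the grid size, obstacle layout, rewards, and hyperparameters are all assumptions chosen for illustration.

```python
import random

# All names and numbers below are illustrative assumptions, not values from the paper.
SIZE = 6
START, GOAL = (0, 0), (5, 5)
OBSTACLES = {(1, 2), (2, 2), (3, 4), (4, 1)}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply one move; return (next_state, reward, done, collided)."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < SIZE and 0 <= c < SIZE) or (r, c) in OBSTACLES:
        return state, -5.0, False, True  # bumped into a wall or obstacle, stay put
    if (r, c) == GOAL:
        return (r, c), 10.0, True, False
    return (r, c), -1.0, False, False


def hinted_actions(state):
    """Stand-in for a coarse spoken hint: indices of actions that head toward the goal."""
    hints = []
    if GOAL[0] < state[0]:
        hints.append(0)
    if GOAL[0] > state[0]:
        hints.append(1)
    if GOAL[1] < state[1]:
        hints.append(2)
    if GOAL[1] > state[1]:
        hints.append(3)
    return hints or list(range(len(ACTIONS)))


def train(episodes=500, guided_episodes=50, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Epsilon-greedy Q-learning; returns the Q-table and the total collision count."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(SIZE) for c in range(SIZE)}
    collisions = 0
    for episode in range(episodes):
        state = START
        for _ in range(200):  # cap on steps per episode
            if rng.random() < epsilon:
                # Exploration is limited to hinted actions during the guided phase.
                guided = episode < guided_episodes
                pool = hinted_actions(state) if guided else list(range(len(ACTIONS)))
                a = rng.choice(pool)
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            nxt, reward, done, collided = step(state, ACTIONS[a])
            collisions += int(collided)
            # Standard Q-learning update; no bootstrap from the terminal state.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q, collisions


def greedy_path(q, max_steps=50):
    """Follow the highest-valued action from START until the goal or the step cap."""
    state, path = START, [START]
    for _ in range(max_steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _reward, done, _collided = step(state, ACTIONS[a])
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    q_table, hits = train()
    print("collisions during training:", hits)
    print("greedy path:", greedy_path(q_table))
```

Calling train(guided_episodes=0) gives a plain Q-learning baseline under the same assumptions, so the two collision counts can be compared on this toy grid. Any such comparison says nothing about the results reported in the paper.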


Pages: 507-512

Page count: 6

