Interactive Reinforcement Learning Strategy

Cited by: 1

Authors:

Shi, Zhenjie [1]

Ma, Wenming [1]

Yin, Shuai [1]

Zhang, Hailiang [1]

Zhao, Xiaofan [1]

Affiliations:

[1] Yantai Univ, Sch Comp & Control Engn, Yantai, Peoples R China

Source:

2021 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING & COMMUNICATIONS, INTERNET OF PEOPLE, AND SMART CITY INNOVATIONS (SMARTWORLD/SCALCOM/UIC/ATC/IOP/SCI 2021) | 2021

Keywords:

Reinforcement learning; interactive learning; path planning; Q-learning;

DOI:

10.1109/SWC50871.2021.00075

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The arrival of AlphaGo set off a new wave of interest in reinforcement learning, which has become one of the most popular directions in artificial intelligence. In essence, it continually integrates and upgrades various machine learning methods, with agents learning through repeated trial and error to obtain cumulative rewards. Q-learning is the most commonly used method in reinforcement learning, but it suffers from several problems: little information early in training, long learning times, low learning efficiency, and repeated trial and error. As a result, Q-learning cannot be applied directly in real environments. To address this, the authors propose an interactive learning method that combines voice commands with Q-learning. In the early stage of learning, the method uses spoken interaction between a human and the agent to identify a relatively large target range; the search range is then narrowed step by step, guiding the agent to learn quickly and reducing the blindness of its exploration. Simulation experiments show that, compared with the standard Q-learning algorithm, the proposed algorithm converges faster, shortens learning time, and reduces the number of collisions, enabling the agent to quickly find a better collision-free path.
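The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a hypothetical 6x6 grid world and models each "voice command" as a pre-scripted target rectangle (no speech recognition is involved). While a hint is active, exploratory moves are limited to those that do not lead away from the hinted region, and later hints can name a smaller region. All names (`step`, `dist_to_region`, `explore`, `train`, `greedy_path`), the layout, the rewards, and the hyperparameters are assumptions made for illustration.

```python
import random

# Hypothetical 6x6 grid world. The layout, rewards, and hyperparameters are
# illustrative assumptions, not values taken from the paper.
SIZE = 6
START, GOAL = (0, 0), (5, 5)
OBSTACLES = {(1, 1), (2, 3), (3, 1), (4, 3)}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply a move; return (next_state, reward, done, collided)."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < SIZE and 0 <= c < SIZE) or (r, c) in OBSTACLES:
        return state, -5.0, False, True  # collision: stay put, pay a penalty
    if (r, c) == GOAL:
        return (r, c), 10.0, True, False
    return (r, c), -0.1, False, False  # small cost per step


def dist_to_region(state, region):
    """Manhattan distance from a cell to the rectangle (r0, c0, r1, c1)."""
    r0, c0, r1, c1 = region
    dr = max(r0 - state[0], 0, state[0] - r1)
    dc = max(c0 - state[1], 0, state[1] - c1)
    return dr + dc


def explore(state, region, rng):
    """Pick a random action; a hint limits the choice to moves that do not
    lead away from the hinted region."""
    candidates = list(range(len(ACTIONS)))
    if region is not None:
        here = dist_to_region(state, region)
        toward = [a for a in candidates
                  if dist_to_region((state[0] + ACTIONS[a][0],
                                     state[1] + ACTIONS[a][1]), region) <= here]
        candidates = toward or candidates
    return rng.choice(candidates)


def train(episodes=300, guidance=None, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning. `guidance` maps an episode index to a target
    region and stands in for a spoken hint (no speech recognition here)."""
    rng = random.Random(seed)
    q, collisions = {}, 0
    for ep in range(episodes):
        # Use the most recent hint issued at or before this episode, if any.
        issued = [e for e in (guidance or {}) if e <= ep]
        region = guidance[max(issued)] if issued else None
        state = START
        for _ in range(200):  # cap on steps per episode
            qs = [q.get((state, a), 0.0) for a in range(len(ACTIONS))]
            if rng.random() < eps:
                a = explore(state, region, rng)
            else:
                a = qs.index(max(qs))
            nxt, reward, done, collided = step(state, ACTIONS[a])
            collisions += int(collided)
            best_next = 0.0 if done else max(
                q.get((nxt, b), 0.0) for b in range(len(ACTIONS)))
            # Standard Q-learning update.
            q[(state, a)] = qs[a] + alpha * (reward + gamma * best_next - qs[a])
            state = nxt
            if done:
                break
    return q, collisions


def greedy_path(q, max_steps=50):
    """Follow the greedy policy from START until GOAL or max_steps moves."""
    state, path = START, [START]
    for _ in range(max_steps):
        qs = [q.get((state, a), 0.0) for a in range(len(ACTIONS))]
        state, _reward, done, _hit = step(state, ACTIONS[qs.index(max(qs))])
        path.append(state)
        if done:
            break
    return path
```

For instance, `train(guidance={0: (2, 2, 5, 5), 50: (4, 4, 5, 5)})` issues a coarse hint first and a tighter one from episode 50 onward. Comparing its collision count with that of plain `train()` gives a rough feel for the effect, though any difference depends on the assumed layout, hint schedule, and seed.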


Pages: 507-512

Number of pages: 6

50 items in total

[1] Interactive Reinforcement Learning with Inaccurate Feedback
Faulkner, Taylor A. Kessler
Short, Elaine Schaertl
Thomaz, Andrea L.
2020 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2020, : 7498 - 7504
[2] An experimental study on interactive reinforcement learning
Nakashima, Tomoharu
Nakamura, Yosuke
Uenishi, Takesuke
Narimoto, Yosuke
PROCEEDINGS OF THE SIXTEENTH INTERNATIONAL SYMPOSIUM ON ARTIFICIAL LIFE AND ROBOTICS (AROB 16TH '11), 2011, : 735 - 740
[3] Creating Interactive Crowds with Reinforcement Learning
Kwiatkowski, Ariel
THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 12886 - 12887
[4] The reinforcement learning Kelly strategy
Jiang, R.
Saunders, D.
Weng, C.
QUANTITATIVE FINANCE, 2022, 22 (08) : 1445 - 1464
[5] Exploration from Demonstration for Interactive Reinforcement Learning
Subramanian, Kaushik
Isbell, Charles L., Jr.
Thomaz, Andrea L.
AAMAS'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2016, : 447 - 456
[6] Interactive Reinforcement Learning from Imperfect Teachers
Faulkner, Taylor A. Kessler
Thomaz, Andrea
HRI '21: COMPANION OF THE 2021 ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION, 2021, : 577 - 579
[7] Interactive multiagent reinforcement learning with motivation rules
Yamaguchi, T
Marukawa, R
ICCIMA 2001: FOURTH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND MULTIMEDIA APPLICATIONS, PROCEEDINGS, 2001, : 128 - 132
[8] Interactive preference analysis: A reinforcement learning framework
Hu, Xiao
Kang, Siqin
Ren, Long
Zhu, Shaokeng
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2024, 319 (03) : 983 - 998
[9] Towards interactive reinforcement learning with intrinsic feedback
Poole, Benjamin
Lee, Minwoo
NEUROCOMPUTING, 2024, 587
[10] Interactive relational reinforcement learning of concept semantics
Nickles, Matthias
Rettinger, Achim
MACHINE LEARNING, 2014, 94 (02) : 169 - 204
