Interactive Q-learning on heterogeneous agents system for autonomous adaptive interface

Cited: 0
|
Authors
Ishiwaka, Y [1 ]
Yokoi, H [1 ]
Kakazu, Y [1 ]
Affiliation
[1] Hakodate Natl Coll Technol, Dept Informat Engn, Hakodate, Hokkaido 0428501, Japan
Keywords
Interactive Q-learning (IQL); POSMDP; heterogeneous multiagent system; Khepera;
DOI
Not available
Chinese Library Classification
TP [Automation Technology; Computer Technology];
Discipline Classification Code
0812
Abstract
The purpose of this system is to adapt to bedridden people who cannot move their bodies easily, so simple reinforcement signals are employed. The application is the control of the behavior of a Khepera robot, a small mobile robot. As the simple reinforcement signals, on-off signals are issued when the operator, acting as the training agent, feels discomfort with the behavior of the learning agent, the Khepera robot. We propose a new reinforcement learning method called Interactive Q-learning together with a heterogeneous multi-agent system. Our multi-agent system comprises three kinds of heterogeneous single agents: a learning agent, a training agent, and an interface agent. The system is hierarchical, with three levels. In the real world, it is impossible to iterate through the many episodes and steps needed for convergence that general reinforcement learning relies on in simulation. We present the results of experiments using the Khepera robot with three examinees, and discuss how rewards are given by each operator and the significance of the heterogeneous multi-agent system. We confirmed the effectiveness through several experiments controlling the behavior of the Khepera robot in the real world. The convergence of our learning system is quite fast. Furthermore, the importance of the interface agent is demonstrated. Individual differences in the timing of giving penalties occurred even though all operators were young.
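The abstract's core idea, mapping an operator's on-off "discomfort" button press to a penalty inside an otherwise standard tabular Q-learning update, can be sketched as follows. This is a minimal illustration under stated assumptions: the state and action names, the penalty value of -1, and the learning parameters are all hypothetical choices for the sketch, not the paper's actual IQL formulation.

```python
import random

ACTIONS = ["forward", "turn_left", "turn_right"]
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # illustrative parameter choices

def signal_to_reward(operator_pressed: bool) -> float:
    # The operator's binary on-off signal becomes a scalar reward:
    # a press means discomfort with the robot's behavior, i.e. a penalty.
    return -1.0 if operator_pressed else 0.0

def select_action(q: dict, state: str) -> str:
    # Epsilon-greedy choice over the tabular Q-values for this state.
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))

def update(q: dict, state: str, action: str,
           operator_pressed: bool, next_state: str) -> None:
    # One-step Q-learning backup driven by the human training signal.
    r = signal_to_reward(operator_pressed)
    best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + ALPHA * (r + GAMMA * best_next - old)

q = {}
update(q, "near_wall", "forward", True, "collision")     # press -> penalized
update(q, "open_space", "forward", False, "open_space")  # no press -> neutral
```

Because the reward comes from a human watching the real robot rather than a simulator, each update is expensive; this is why the abstract emphasizes fast convergence with few episodes.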
Pages: 475-484
Page count: 10
Related Papers
50 records in total
  • [41] Enhanced continuous valued Q-learning for real autonomous robots
    Takeda, M
    Nakamura, T
    Imai, M
    Ogasawara, T
    Asada, M
    ADVANCED ROBOTICS, 2000, 14 (05) : 439 - 441
  • [42] Autonomous algorithmic collusion: Q-learning under sequential pricing
    Klein, Timo
    RAND JOURNAL OF ECONOMICS, 2021, 52 (03): : 538 - 558
  • [43] Adaptive and Dynamic Service Composition Using Q-Learning
    Wang, Hongbing
    Zhou, Xuan
    Zhou, Xiang
    Liu, Weihong
    Li, Wenya
    22ND INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2010), PROCEEDINGS, VOL 1, 2010,
  • [44] Adaptive sensor-planning algorithm with Q-learning
    Maeda, M
    Kato, N
    Kashimura, H
    2004 IEEE CONFERENCE ON CYBERNETICS AND INTELLIGENT SYSTEMS, VOLS 1 AND 2, 2004, : 966 - 969
  • [45] Adaptive PID controller based on Q-learning algorithm
    Shi, Qian
    Lam, Hak-Keung
    Xiao, Bo
    Tsai, Shun-Hung
    CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2018, 3 (04) : 235 - 244
  • [46] Adaptive play Q-Learning with initial heuristic approximation
    Burkov, Andriy
    Chaib-draa, Brahim
    PROCEEDINGS OF THE 2007 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS 1-10, 2007, : 1749 - +
  • [47] Evaluation of Q-Learning approach for HTTP Adaptive Streaming
    Martin, Virginia
    Cabrera, Julian
    Garcia, Narciso
    2016 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE), 2016,
  • [48] On-policy Q-learning for Adaptive Optimal Control
    Jha, Sumit Kumar
    Bhasin, Shubhendu
    2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2014, : 301 - 306
  • [49] An adaptive honeypot using Q-Learning with severity analyzer
    Shraddha Suratkar
    Kunjal Shah
    Aditya Sood
    Anay Loya
    Dhaval Bisure
    Umesh Patil
    Faruk Kazi
    Journal of Ambient Intelligence and Humanized Computing, 2022, 13 : 4865 - 4876
  • [50] QLAR: A Q-Learning based Adaptive Routing for MANETs
    Serhani, Abdellatif
    Naja, Najib
    Jamali, Abdellah
    2016 IEEE/ACS 13TH INTERNATIONAL CONFERENCE OF COMPUTER SYSTEMS AND APPLICATIONS (AICCSA), 2016,