Learning Automata-Based Multiagent Reinforcement Learning for Optimization of Cooperative Tasks

Cited by: 54
Authors
Zhang, Zhen [1]
Wang, Dongqing [2]
Gao, Junwei [1]
Affiliations
[1] Qingdao Univ, Sch Automat, Qingdao 266071, Peoples R China
[2] Qingdao Univ, Sch Elect Engn, Qingdao 266071, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Games; Learning automata; Task analysis; Stochastic processes; Optimization; Learning (artificial intelligence); Clustering algorithms; multiagent reinforcement learning (MARL); multiagent system; reinforcement learning (RL); EVOLUTIONARY GAME-THEORY; ALGORITHM; CONSENSUS; SYSTEM;
DOI
10.1109/TNNLS.2020.3025711
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Multiagent reinforcement learning (MARL) has been extensively used in many applications for its tractable implementation and task distribution. Learning automata, which can be classified under MARL in the category of independent learners, are used to obtain the optimal joint action or some type of equilibrium. Learning automata have two advantages. First, they do not require any agent to observe the action of any other agent. Second, they are simple in structure and easy to implement. Learning automata have been applied to function optimization, image processing, data clustering, recommender systems, and wireless sensor networks. However, few learning automata-based algorithms have been proposed for the optimization of cooperative repeated games and stochastic games. We propose an algorithm known as learning automata for optimization of cooperative agents (LA-OCA). To make learning automata applicable to cooperative tasks, we transform the environment to a P-model by introducing an indicator variable whose value is one when the maximal reward is obtained and zero otherwise. Theoretical analysis shows that all the strict optimal joint actions are stable critical points of the model of LA-OCA in cooperative repeated games with an arbitrary finite number of players and actions. Simulation results show that LA-OCA obtains the pure optimal joint strategy with a success rate of 100% in all three cooperative tasks and outperforms the other algorithms in terms of learning speed.
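The P-model transformation described in the abstract can be illustrated with a small sketch. This is not the LA-OCA algorithm, whose update rule is not given in this record. It assumes a standard linear reward-inaction automaton as a stand-in, assumes the maximal reward is known in advance, and uses a hypothetical 3x3 payoff matrix. The names indicator, RewardInactionAutomaton, and run_repeated_game are illustrative only.

```python
import random


def indicator(reward, max_reward):
    """P-model feedback: 1 when the maximal reward is obtained, 0 otherwise."""
    return 1 if reward >= max_reward else 0


class RewardInactionAutomaton:
    """Linear reward-inaction automaton (assumed stand-in, not LA-OCA's rule)."""

    def __init__(self, n_actions, step, rng):
        self.p = [1.0 / n_actions] * n_actions  # start from a uniform strategy
        self.step = step
        self.rng = rng

    def choose(self):
        # Sample an action from the current probability vector.
        return self.rng.choices(range(len(self.p)), weights=self.p)[0]

    def update(self, action, beta):
        # Inaction on beta == 0; on beta == 1 shift mass toward the chosen action.
        if beta != 1:
            return
        for i in range(len(self.p)):
            if i == action:
                self.p[i] += self.step * (1.0 - self.p[i])
            else:
                self.p[i] -= self.step * self.p[i]


def run_repeated_game(payoff, steps=5000, step=0.05, seed=0):
    """Two independent learners play a cooperative repeated game."""
    rng = random.Random(seed)
    max_reward = max(max(row) for row in payoff)  # assumption: known beforehand
    row_agent = RewardInactionAutomaton(len(payoff), step, rng)
    col_agent = RewardInactionAutomaton(len(payoff[0]), step, rng)
    for _ in range(steps):
        a = row_agent.choose()
        b = col_agent.choose()
        # Both agents receive the same binary signal; neither sees the other's action.
        beta = indicator(payoff[a][b], max_reward)
        row_agent.update(a, beta)
        col_agent.update(b, beta)
    return row_agent, col_agent


if __name__ == "__main__":
    # Hypothetical payoff matrix with a single optimal joint action at (0, 0).
    game = [[11, -30, 0], [-30, 7, 6], [0, 0, 5]]
    row_agent, col_agent = run_repeated_game(game)
    print(row_agent.p, col_agent.p)
```

Because only the maximal reward produces a nonzero signal, the probability of each agent's optimal action never decreases in this sketch. That is a property of the assumed reward-inaction rule with a known maximum, not a restatement of the paper's analysis.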
Pages: 4639-4652
Number of pages: 14