Deconfounded Opponent Intention Inference for Football Multi-Player Policy Learning

Cited by: 1
Authors
Wang, Shijie [1, 2]
Pan, Yi [2]
Pu, Zhiqiang [1, 2]
Liu, Boyin [1, 2]
Yi, Jianqiang [1, 2]
Affiliations
[1] University of Chinese Academy of Sciences, School of Artificial Intelligence, Beijing 100049, People's Republic of China
[2] Chinese Academy of Sciences, Institute of Automation, Beijing 100190, People's Republic of China
DOI
10.1109/IROS55552.2023.10341469
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Because a football match is highly complex, the opponents' strategies are variable and unknown. Accurately predicting the opponents' future intentions from the current situation is therefore crucial for the players' decision-making. To better anticipate the opponents and learn more effective strategies, this paper proposes a deconfounded opponent intention inference (DOII) method for football multi-player policy learning. Specifically, an opponent intention supervising module infers the opponents' intentions. Because some confounders affect the causal relationship between the players and the opponents, a deconfounded trajectory graph module is designed to mitigate their influence and increase the accuracy of the inferred intentions. In addition, an opponent-based incentive module is designed to make the players more sensitive to the opponents' intentions and thereby to train more reasonable player strategies. Representative results indicate that DOII effectively improves the performance of the players' strategies in the Google Research Football environment, which supports the effectiveness of the proposed method.
Pages: 8054-8061
Number of pages: 8