Deconfounded Opponent Intention Inference for Football Multi-Player Policy Learning

Cited by: 1
Authors
Wang, Shijie [1 ,2 ]
Pan, Yi [2 ]
Pu, Zhiqiang [1 ,2 ]
Liu, Boyin [1 ,2 ]
Yi, Jianqiang [1 ,2 ]
Affiliations
[1] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
DOI
10.1109/IROS55552.2023.10341469
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Due to the high complexity of a football match, the opponents' strategies are variable and unknown. Accurately predicting the opponents' future intentions from the current situation is therefore crucial for football players' decision-making. To better anticipate the opponents and learn more effective strategies, this paper proposes a deconfounded opponent intention inference (DOII) method for football multi-player policy learning. Specifically, an opponent intention supervising module infers the opponents' intentions. Because some confounders affect the causal relationship between the players and the opponents, a deconfounded trajectory graph module is designed to mitigate their influence and increase the accuracy of the inferred intentions. In addition, an opponent-based incentive module is designed to improve the players' sensitivity to the opponents' intentions and thereby train more reasonable player strategies. Representative results indicate that DOII effectively improves the performance of players' strategies in the Google Research Football environment, which validates the superiority of the proposed method.
Pages: 8054-8061
Number of pages: 8