Regularized Soft Actor-Critic for Behavior Transfer Learning

Cited by: 0
Authors
Tan, Mingxi [1 ]
Tian, Andong [1 ]
Denoyer, Ludovic [1 ]
Affiliations
[1] Ubisoft, Ubisoft La Forge, Chengdu, People's Republic of China
Keywords
CMDP; behavior style; video game
DOI
10.1109/CoG51982.2022.9893655
CLC Classification Number
TP39 [Computer Applications]
Subject Classification Codes
081203; 0835
Abstract
Existing imitation learning methods mainly focus on making an agent effectively mimic a demonstrated behavior, but do not address the potential contradiction between the behavior style and the objective of a task. There is a general lack of efficient methods that allow an agent to partially imitate a demonstrated behavior to varying degrees while completing the main objective of a task. In this paper we propose a method called Regularized Soft Actor-Critic, which formulates the main task and the imitation task under the Constrained Markov Decision Process (CMDP) framework. The main task is defined as the maximum entropy objective used in Soft Actor-Critic (SAC), and the imitation task is defined as a constraint. We evaluate our method on continuous control tasks relevant to video game applications.
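To make the setup concrete, the following is a minimal PyTorch sketch of one way a SAC-style maximum-entropy actor loss can be combined with an imitation term via a Lagrangian-style relaxation of the CMDP constraint. It is an illustration under stated assumptions, not the authors' implementation: the names GaussianPolicy, regularized_actor_loss, and the fixed coefficient lam are hypothetical, the policy is an unsquashed Gaussian, and the imitation constraint is folded into the loss as a penalty rather than handled with a learned multiplier.

    # Minimal sketch: SAC-style actor loss plus an imitation penalty.
    # Assumptions (not from the paper): Gaussian policy without tanh squashing,
    # a single Q-network, imitation measured as negative log-likelihood of
    # demonstrated actions, and a fixed penalty weight `lam`.
    import torch
    import torch.nn as nn


    class GaussianPolicy(nn.Module):
        """Tiny Gaussian policy, included only to keep the sketch self-contained."""

        def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
            super().__init__()
            self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
            self.mu = nn.Linear(hidden, act_dim)
            self.log_std = nn.Linear(hidden, act_dim)

        def dist(self, obs: torch.Tensor) -> torch.distributions.Normal:
            h = self.body(obs)
            return torch.distributions.Normal(self.mu(h), self.log_std(h).clamp(-5, 2).exp())


    def regularized_actor_loss(policy, q_net, obs, demo_obs, demo_act,
                               alpha=0.2, lam=1.0):
        """Maximum-entropy actor loss (main task) plus an imitation penalty (constraint).

        The main task follows the usual SAC objective: maximize Q plus policy entropy.
        The imitation constraint is relaxed into a penalty weighted by `lam`.
        """
        # Main task: maximum-entropy SAC objective (tanh squashing and its
        # log-prob correction are omitted for brevity).
        dist = policy.dist(obs)
        action = dist.rsample()                      # reparameterized sample
        log_pi = dist.log_prob(action).sum(-1)
        q_value = q_net(torch.cat([obs, action], dim=-1)).squeeze(-1)
        sac_loss = (alpha * log_pi - q_value).mean()

        # Imitation term: push the policy toward the demonstrated actions by
        # penalizing their negative log-likelihood under the current policy.
        demo_nll = -policy.dist(demo_obs).log_prob(demo_act).sum(-1).mean()

        return sac_loss + lam * demo_nll


    # Usage with random placeholder data:
    policy = GaussianPolicy(obs_dim=8, act_dim=2)
    q_net = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1))
    obs = torch.randn(32, 8)
    demo_obs, demo_act = torch.randn(16, 8), torch.randn(16, 2)
    loss = regularized_actor_loss(policy, q_net, obs, demo_obs, demo_act)
    loss.backward()

In such a relaxation, tuning lam (or learning it as a Lagrange multiplier) controls how strongly the policy follows the demonstrated behavior, which is one way to realize partial imitation to varying degrees while still optimizing the main task.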
Pages: 516-519
Page count: 4
Related Papers
50 items in total (items 21-30 shown below)
  • [21] Divergence-Regularized Multi-Agent Actor-Critic
    Su, Kefan
    Lu, Zongqing
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [22] TRANSFER LEARNING BASED ON FORBIDDEN RULE SET IN ACTOR-CRITIC METHOD
    Takano, Toshiaki
    Takase, Haruhiko
    Kawanaka, Hiroharu
    Kita, Hidehiko
    Hayashi, Terumine
    Tsuruoka, Shinji
    INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2011, 7 (5B): 2907-2917
  • [23] Generalized Offline Actor-Critic with Behavior Regularization
    Cheng Y.-H.
    Huang L.-Y.
    Hou D.-Y.
    Zhang J.-Z.
    Chen J.-L.
    Wang X.-S.
    Jisuanji Xuebao/Chinese Journal of Computers, 2023, 46 (04): 843-855
  • [24] Influences of Reinforcement and Choice Histories on Choice Behavior in Actor-Critic Learning
    Katahira K.
    Kimura K.
    Computational Brain & Behavior, 2023, 6 (2) : 172 - 194
  • [25] SOFT ACTOR-CRITIC REINFORCEMENT LEARNING FOR ROBOTIC MANIPULATOR WITH HINDSIGHT EXPERIENCE REPLAY
    Yan, Tao
    Zhang, Wenan
    Yang, Simon X.
    Yu, Li
    INTERNATIONAL JOURNAL OF ROBOTICS & AUTOMATION, 2019, 34 (05): 536-543
  • [26] SAC-FACT: Soft Actor-Critic Reinforcement Learning for Counterfactual Explanations
    Ezzeddine, Fatima
    Ayoub, Omran
    Andreoletti, Davide
    Giordano, Silvia
    EXPLAINABLE ARTIFICIAL INTELLIGENCE, XAI 2023, PT I, 2023, 1901: 195-216
  • [27] CONTROLLED SENSING AND ANOMALY DETECTION VIA SOFT ACTOR-CRITIC REINFORCEMENT LEARNING
    Zhong, Chen
    Gursoy, M. Cenk
    Velipasalar, Senem
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022: 4198-4202
  • [28] Optimal Scheduling of Regional Integrated Energy System Based on Advantage Learning Soft Actor-critic Algorithm and Transfer Learning
    Luo W.
    Zhang J.
    He Y.
    Gu T.
    Nie X.
    Fan L.
    Yuan X.
    Li B.
    Dianwang Jishu/Power System Technology, 2023, 47 (04): 1601-1611
  • [29] Reinforcement learning for automatic quadrilateral mesh generation: A soft actor-critic approach
    Pan, Jie
    Huang, Jingwei
    Cheng, Gengdong
    Zeng, Yong
    NEURAL NETWORKS, 2023, 157: 288-304
  • [30] Multi-actor mechanism for actor-critic reinforcement learning
    Li, Lin
    Li, Yuze
    Wei, Wei
    Zhang, Yujia
    Liang, Jiye
    INFORMATION SCIENCES, 2023, 647