A Multi-Teacher Policy Distillation Framework for Enhancing Zero-Shot Generalization of Autonomous Driving Policies

Cited by: 0
Authors
Yang, Jiachen [1]
Zhang, Jipeng [1]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation;
Keywords
Training; Autonomous vehicles; Statistics; Sociology; Regulation; Robustness; Manuals; Zero-shot generalization; multi-teacher policy distillation; k-determinantal point process; gradient matching; regulation mechanism; REINFORCEMENT; GO;
DOI
10.1109/TVT.2024.3379972
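Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes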
0808 ; 0809 ;
Abstract
Generating reliable autonomous driving policies is an important goal in developing future transportation systems. Deep reinforcement learning has the potential to achieve this goal. However, policies generated by conventional deep reinforcement learning suffer from poor zero-shot generalization in the face of changes in environment conditions. Domain randomization may provide a solution, yet it brings a high degree of variance and unpredictability to the training process. We utilize multi-teacher policy distillation to circumvent this risk. However, conventional multi-teacher policy distillation paradigms present some issues. First, the teacher agent population is not diverse enough to provide the student agent with various knowledge. Second, the student agent only learns the output activations of teacher agents, without fully leveraging the guidance of teacher agents. Third, some teacher agents may dominate the training process, leading to the neglect of knowledge imparted by other teacher agents. To address these issues, we propose a three-stage multi-teacher policy distillation framework. The first stage is based on a k-determinantal point process. Training environments with dissimilar parameter settings are selected, diversifying the pre-trained teacher agents and providing the student agent with various knowledge. In the second stage, a gradient matching mechanism is applied to enable the student agent to benefit from gradients from multiple teacher agents. In the last stage, we propose a regulation mechanism to adaptively adjust the impact of each teacher agent on the student agent. This mechanism ensures balanced influence from each teacher agent. Experimental results show that our proposed framework effectively improves zero-shot generalization performance in environments with unseen conditions. Additionally, we analyze the influence of some key factors of our proposed framework.
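The first stage described in the abstract selects training environments with dissimilar parameter settings using a k-determinantal point process. The following is a minimal sketch of that idea, not the authors' implementation: it swaps exact k-DPP sampling for a simple greedy search that, at each step, adds the candidate maximizing the determinant of an RBF kernel submatrix. The function names, the RBF kernel, the `bandwidth` parameter, and the one-dimensional example settings are all assumptions made for illustration.

```python
import math


def rbf_kernel(x, y, bandwidth=1.0):
    """Similarity between two parameter vectors (1.0 when identical)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def greedy_diverse_subset(points, k, bandwidth=1.0):
    """Greedily pick k indices whose kernel submatrix has a large determinant.

    This is a greedy approximation (assumed here for simplicity), not an
    exact k-DPP sampler. Ties are broken in favor of the lowest index.
    """
    selected = []
    for _ in range(k):
        best_idx, best_det = None, -1.0
        for i in range(len(points)):
            if i in selected:
                continue
            idx = selected + [i]
            sub = [[rbf_kernel(points[a], points[b], bandwidth) for b in idx]
                   for a in idx]
            d = determinant(sub)
            if d > best_det:
                best_idx, best_det = i, d
        selected.append(best_idx)
    return selected


# Hypothetical one-dimensional environment settings (e.g. a normalized
# road-friction value); real settings would be multi-dimensional.
candidate_settings = [(0.0,), (0.1,), (1.0,), (0.9,), (0.5,)]
chosen = greedy_diverse_subset(candidate_settings, k=3)
print([candidate_settings[i] for i in chosen])
```

Near-duplicate settings make the kernel submatrix close to singular, so its determinant shrinks and the greedy step prefers settings that are spread apart. That is the property the abstract relies on to diversify the teacher agents.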
Pages: 9734-9746
Number of pages: 13