Towards Interpretable Deep Reinforcement Learning Models via Inverse Reinforcement Learning

Cited by: 0
Authors
Xie, Yuansheng [1 ]
Vosoughi, Soroush [1 ]
Hassanpour, Saeed [2 ]
Affiliations
[1] Dartmouth Coll, Dept Comp Sci, Hanover, NH 03755 USA
[2] Dartmouth Coll, Dept Biomed Data Sci, Hanover, NH 03755 USA
Keywords
Adversarial Inverse Reinforcement Learning; Natural Language Processing; Abstractive Summarization; Black-Box
DOI
10.1109/ICPR56361.2022.9956245
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Code(s)
081104; 0812; 0835; 1405
Abstract
Artificial Intelligence, particularly through recent advances in deep learning (DL), has achieved exceptional performance on many tasks in fields such as natural language processing and computer vision. In certain high-stakes domains, however, a high level of interpretability is required, in addition to desirable performance metrics, before AI can be reliably deployed. Unfortunately, the black-box nature of DL models prevents researchers from providing explanatory descriptions of a DL model's reasoning process and decisions. In this work, we propose a novel framework utilizing Adversarial Inverse Reinforcement Learning that provides global explanations for the decisions made by a Reinforcement Learning model and captures the intuitive tendencies the model follows by summarizing its decision-making process.
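The record gives no implementation detail beyond naming Adversarial Inverse Reinforcement Learning (AIRL) as the core technique. As a rough, non-authoritative illustration of that component, the sketch below implements the standard AIRL discriminator of Fu et al. (2018), in which a learned reward term g(s, a) and a shaping potential h(s) jointly define the discriminator; the recovered g is the piece one would inspect to explain the policy. Network sizes and names here are illustrative assumptions, not the authors' code.

import torch
import torch.nn as nn

class AIRLDiscriminator(nn.Module):
    # D(s, a, s') = exp(f) / (exp(f) + pi(a|s)), with
    # f(s, a, s') = g(s, a) + gamma * h(s') - h(s).
    # (Hypothetical layer sizes; for illustration only.)
    def __init__(self, state_dim, action_dim, hidden=64, gamma=0.99):
        super().__init__()
        self.gamma = gamma
        # g: the reward approximator -- the interpretable component
        self.g = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))
        # h: state-only shaping potential that absorbs dynamics effects
        self.h = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, s, a, s_next, log_pi):
        f = (self.g(torch.cat([s, a], dim=-1))
             + self.gamma * self.h(s_next) - self.h(s))
        # logit(D) = log D - log(1 - D) simplifies to f - log pi(a|s)
        return f - log_pi.unsqueeze(-1)

In training, expert transitions are labeled 1 and policy rollouts 0 under a binary cross-entropy loss on these logits; after convergence, g alone serves as a transferable, inspectable reward summary of the model's behavior.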
Pages: 5067-5074
Number of pages: 8