Adaptive early classification of temporal sequences using deep reinforcement learning

Cited by: 22
Authors
Martinez, Coralie [1 ]
Ramasso, Emmanuel [2 ]
Perrin, Guillaume [1 ]
Rombaut, Michele [3 ]
Affiliations
[1] bioMerieux, Marcy Letoile, France
[2] Univ Bourgogne Franche Comte, FEMTO ST Inst, Besancon, France
[3] Univ Grenoble Alpes, GIPSA Lab, Grenoble Inst Engn, Grenoble, France
Keywords
Early classification; Adaptive prediction time; Deep reinforcement learning; Temporal sequences; Double DQN; Trade-off between accuracy vs. speed;
DOI
10.1016/j.knosys.2019.105290
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this article, we address the problem of early classification (EC) of temporal sequences with adaptive prediction times. We frame EC as a sequential decision-making problem and define a partially observable Markov decision process (POMDP) that fits the competing objectives of classification earliness and accuracy. We solve the POMDP by training an agent for EC with deep reinforcement learning (DRL). The agent learns to decide adaptively between classifying an incomplete sequence now and delaying its prediction to gather more measurements. We adapt an existing DRL algorithm for batch and online learning of the agent's action-value function with a deep neural network. We propose strategies of prioritized sampling, prioritized storing and random episode initialization to address the imbalance in the agent's memory, which arises because (1) all but one of its actions terminate the process and thus (2) classification actions are less frequent than the delay action. In experiments, we show improvements in accuracy induced by our specific adaptation of the algorithm used for online learning of the agent's action-value function. Moreover, we compare two definitions of the POMDP, one based on delay reward shaping and one based on reward discounting. Finally, we demonstrate that a naive static deep neural network, i.e., one trained to classify at static times, achieves a worse trade-off between accuracy and speed than the equivalent network trained with adaptive decision-making capabilities. © 2019 Elsevier B.V. All rights reserved.
Pages: 10
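
The decision process described in the abstract (one non-terminating delay action, plus one terminating action per class) can be illustrated with a minimal sketch. This is not the authors' implementation: the class `EarlyClassificationEpisode`, the reward values, and the handling of a delay request once the sequence is exhausted are all hypothetical choices made for illustration. The helper `double_dqn_target` shows the standard Double DQN target (action chosen by the online network, evaluated by the target network), not the specific adaptation the paper proposes.

```python
import numpy as np

DELAY = 0  # action 0 waits for one more measurement; action k >= 1 predicts class k - 1


class EarlyClassificationEpisode:
    """One episode over a single labelled sequence (hypothetical sketch)."""

    def __init__(self, sequence, label, delay_reward=-0.01,
                 correct_reward=1.0, wrong_reward=-1.0):
        # Reward values are illustrative assumptions, not taken from the paper.
        self.sequence = np.asarray(sequence, dtype=float)
        self.label = int(label)
        self.delay_reward = delay_reward
        self.correct_reward = correct_reward
        self.wrong_reward = wrong_reward
        self.t = 1  # number of measurements revealed so far

    def observation(self):
        # The agent only sees the prefix of the sequence (partial observability).
        return self.sequence[: self.t]

    def step(self, action):
        """Apply an action and return (reward, done)."""
        if action == DELAY:
            if self.t >= len(self.sequence):
                # Assumption: delaying with nothing left to observe ends the
                # episode and is penalised like a wrong prediction.
                return self.wrong_reward, True
            self.t += 1
            return self.delay_reward, False
        # Every classification action terminates the episode.
        predicted = action - 1
        correct = predicted == self.label
        return (self.correct_reward if correct else self.wrong_reward), True


def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Standard Double DQN target for a single transition."""
    if done:
        return float(reward)
    best_action = int(np.argmax(q_online_next))  # selection: online network
    return float(reward + gamma * q_target_next[best_action])  # evaluation: target network


# Usage: wait once, then classify the two-measurement prefix as class 1.
episode = EarlyClassificationEpisode([0.1, 0.4, 0.9], label=1)
r_delay, done_delay = episode.step(DELAY)
prefix_len = len(episode.observation())
r_final, done_final = episode.step(2)
```

Because every classification action ends the episode, a replay memory filled from such episodes holds many more delay transitions than classification transitions, which is the imbalance the abstract's prioritized sampling and storing strategies are meant to address.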