Adaptive early classification of temporal sequences using deep reinforcement learning

Cited by: 22
Authors
Martinez, Coralie [1]
Ramasso, Emmanuel [2]
Perrin, Guillaume [1]
Rombaut, Michele [3]
Affiliations
[1] bioMerieux, Marcy Letoile, France
[2] Univ Bourgogne Franche Comte, FEMTO ST Inst, Besancon, France
[3] Univ Grenoble Alpes, GIPSA Lab, Grenoble Inst Engn, Grenoble, France
Keywords
Early classification; Adaptive prediction time; Deep reinforcement learning; Temporal sequences; Double DQN; Trade-off between accuracy vs. speed;
DOI
10.1016/j.knosys.2019.105290
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this article, we address the problem of early classification (EC) of temporal sequences with adaptive prediction times. We frame EC as a sequential decision-making problem and define a partially observable Markov decision process (POMDP) that captures the competing objectives of classification earliness and accuracy. We solve the POMDP by training an EC agent with deep reinforcement learning (DRL). The agent learns to decide adaptively between classifying an incomplete sequence now and delaying its prediction to gather more measurements. We adapt an existing DRL algorithm for batch and online learning of the agent's action-value function with a deep neural network. We propose strategies of prioritized sampling, prioritized storing, and random episode initialization to address the imbalance in the agent's memory, which arises because (1) all but one of its actions terminate the process, and thus (2) classification actions are less frequent than the delay action. In experiments, we show improvements in accuracy induced by our specific adaptation of the algorithm used for online learning of the agent's action-value function. Moreover, we compare two definitions of the POMDP, one based on delay reward shaping and one based on reward discounting. Finally, we demonstrate that a static naive deep neural network, i.e., one trained to classify at static times, achieves a worse trade-off between accuracy and speed than the equivalent network trained with adaptive decision-making capabilities. © 2019 Elsevier B.V. All rights reserved.
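
The decision process described in the abstract (one delay action that continues the episode, and classification actions that all terminate it) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `EarlyClassificationEpisode`, the reward values, the handling of a delay at the end of the sequence, and the hand-written `threshold_policy` are all hypothetical choices made for illustration, and no learning (Double DQN or otherwise) is performed here.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

DELAY = 0  # action 0 waits for one more measurement; action k >= 1 predicts class k - 1


@dataclass
class EarlyClassificationEpisode:
    """One episode: a labelled sequence revealed one measurement at a time."""
    sequence: Sequence[float]
    label: int
    delay_reward: float = -0.01   # hypothetical shaping penalty for each delay
    correct_reward: float = 1.0   # hypothetical reward for a correct prediction
    wrong_reward: float = -1.0    # hypothetical reward for a wrong prediction
    t: int = 1                    # number of measurements observed so far

    def observation(self) -> List[float]:
        """Return the prefix of the sequence seen so far (a partial observation)."""
        return list(self.sequence[: self.t])

    def step(self, action: int) -> Tuple[List[float], float, bool]:
        """Apply an action and return (observation, reward, done)."""
        if action == DELAY:
            if self.t < len(self.sequence):
                self.t += 1
                return self.observation(), self.delay_reward, False
            # Assumption: delaying with nothing left to observe counts as a failure.
            return self.observation(), self.wrong_reward, True
        # Every classification action terminates the episode.
        predicted = action - 1
        reward = self.correct_reward if predicted == self.label else self.wrong_reward
        return self.observation(), reward, True


def run_episode(
    episode: EarlyClassificationEpisode,
    policy: Callable[[List[float]], int],
) -> Tuple[float, int]:
    """Roll out a policy; return (total reward, measurements used at termination)."""
    total, done = 0.0, False
    obs = episode.observation()
    while not done:
        obs, reward, done = episode.step(policy(obs))
        total += reward
    return total, episode.t


def threshold_policy(obs: List[float]) -> int:
    """Toy fixed policy: wait for 3 measurements, then classify by the sign of the mean."""
    if len(obs) < 3:
        return DELAY
    return 2 if sum(obs) / len(obs) > 0 else 1
```

Because only the delay action keeps an episode alive, every episode contributes exactly one classification transition but possibly many delay transitions, which is the memory imbalance the abstract says the prioritized sampling and storing strategies are meant to address.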
Pages: 10