Faster MIL-based Subgoal Identification for Reinforcement Learning by Tuning Fewer Hyperparameters

Cited by: 0

Authors:

Sunel, Saim [1]

Cilden, Erkin [2]

Polat, Faruk [1]

Affiliations:

[1] Middle East Tech Univ, Dept Comp Engn, TR-06800 Ankara, Turkiye

[2] STM Def Technol Engn & Trade Inc, RF & Simulat Syst Directorate, Ankara, Turkiye

Source:

ACM TRANSACTIONS ON AUTONOMOUS AND ADAPTIVE SYSTEMS | 2024 / Vol. 19 / No. 02

Keywords:

Subgoal identification; expectation-maximization; diverse density; hyper-parameter search; multiple instance learning; reinforcement learning; DISCOVERY; FRAMEWORK;

DOI:

10.1145/3643852

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Various methods have been proposed in the literature for identifying subgoals in discrete reinforcement learning (RL) tasks. Once subgoals are discovered, task decomposition methods can be employed to improve the learning performance of agents. In this study, we classify prominent subgoal identification methods for discrete RL tasks in the literature into the following three categories: graph-based, statistics-based, and multi-instance learning (MIL)-based. As contributions, first, we introduce a new MIL-based subgoal identification algorithm called EMDD-RL and experimentally compare it with a previous MIL-based method. The previous approach adapts MIL's Diverse Density (DD) algorithm, whereas our method considers Expectation-Maximization Diverse Density (EMDD). The advantage of EMDD over DD is that it can yield more accurate results with less computational demand thanks to the expectation-maximization algorithm. EMDD-RL modifies some of the algorithmic steps of EMDD to identify subgoals in discrete RL problems. Second, we evaluate the methods in several RL tasks for the hyperparameter tuning overhead they incur. Third, we propose a new RL problem called key-room and compare the methods for their subgoal identification performance in this new task. Experimental results show that MIL-based subgoal identification methods could be preferred to the algorithms of the other two categories in practice.
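
The Diverse Density (DD) idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the EMDD-RL algorithm from the paper, and it is not the authors' code: it is only the standard noisy-or DD score applied to a toy grid, under the assumptions that each trajectory is a bag of `(x, y)` states, that successful trajectories are positive bags and failed ones negative bags, and that a Gaussian-like similarity with a hypothetical `scale` parameter measures how close a state is to a candidate subgoal. The function names and the toy data are invented for illustration.

```python
import math

def instance_prob(state, concept, scale=1.0):
    """Similarity in (0, 1] between a visited state and a candidate subgoal.
    The Gaussian-like form and `scale` are assumptions of this sketch."""
    d2 = sum((a - b) ** 2 for a, b in zip(state, concept))
    return math.exp(-d2 / scale)

def diverse_density(concept, positive_bags, negative_bags, scale=1.0):
    """Noisy-or diverse density of `concept`.
    High when every positive bag has a state near the concept
    and no negative bag does."""
    dd = 1.0
    for bag in positive_bags:
        miss = 1.0  # probability that no state in this bag matches the concept
        for state in bag:
            miss *= 1.0 - instance_prob(state, concept, scale)
        dd *= 1.0 - miss
    for bag in negative_bags:
        for state in bag:
            dd *= 1.0 - instance_prob(state, concept, scale)
    return dd

def best_subgoal(positive_bags, negative_bags, scale=1.0):
    """Exhaustively score every state seen in a positive bag (discrete task)."""
    candidates = sorted({s for bag in positive_bags for s in bag})
    return max(
        candidates,
        key=lambda c: diverse_density(c, positive_bags, negative_bags, scale),
    )

# Toy data: every successful trajectory passes through the doorway (2, 2);
# failed trajectories never reach it.
positive = [
    [(0, 0), (1, 1), (2, 2), (3, 3)],
    [(0, 2), (1, 2), (2, 2), (3, 2)],
    [(1, 0), (2, 1), (2, 2), (4, 2)],
]
negative = [
    [(0, 0), (0, 1), (0, 2)],
    [(1, 0), (1, 1), (1, 2)],
]

subgoal = best_subgoal(positive, negative)
print(subgoal)
```

In this toy setting the highest-scoring state is the shared doorway, because any state that also appears in a failed trajectory receives a score of zero. EMDD, as the abstract notes, replaces this kind of direct DD maximization with an expectation-maximization procedure; that procedure is not shown here.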

Pages: 29
