Automatic Curriculum Design for Object Transportation Based on Deep Reinforcement Learning

Cited by: 3
|
Authors
Eoh, Gyuho [1]
Park, Tae-Hyoung [1]
Affiliation
[1] Chungbuk Natl Univ, Ind AI Res Ctr, Cheongju 28116, South Korea
Keywords
Transportation; Robots; Training; Grasping; Reinforcement learning; Tools; Task analysis; Curriculum learning; object transportation; deep reinforcement learning; difficulty level; UNKNOWN OBJECTS; COOPERATIVE TRANSPORT; MOBILE ROBOTS; MANIPULATION;
DOI
10.1109/ACCESS.2021.3118109
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812
Abstract
This paper presents an automatic curriculum learning (ACL) method for object transportation based on deep reinforcement learning (DRL). Previous studies on object transportation using DRL suffer from a sparse reward problem: an agent receives a reward only when it completes the transportation of an object. Curriculum learning (CL) is commonly used to mitigate sparse rewards; however, conventional CL methods must be designed manually by users, which is difficult and tedious, and there is no standard CL method for object transportation. We therefore propose an ACL method for object transportation that requires no human intervention during training: a robot automatically designs curricula for itself and trains iteratively according to those curricula. First, we define the difficulty level of an object-transportation task on a map, determined by the predicted travel distance of the object and the presence of obstacles and walls. The robot first learns transportation at an easy level (short travel distance, few surrounding obstacles) and then progresses to difficult tasks (long travel distance, many surrounding obstacles). Second, because training time also affects transportation performance, we propose an adaptive method for determining the number of training episodes: the episode count is adjusted based on the current success rate of object transportation. We verified the proposed method in simulation environments, where its success rate was 14% higher than training without a curriculum and between 14% (minimum) and 63% (maximum) higher than manually designed curricula. We also conducted real-world experiments to assess the gap between simulation and practical results.
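The curriculum mechanism described in the abstract — score each task's difficulty from predicted travel distance and surrounding obstacles, order tasks from easy to hard, and allocate more training episodes when the current success rate is low — can be sketched as follows. The weighting of the difficulty terms and the episode-allocation rule are illustrative assumptions, not the paper's actual formulas:

```python
def difficulty(travel_distance, n_obstacles, near_wall,
               w_dist=1.0, w_obs=0.5, w_wall=0.5):
    """Score a transportation task; larger means harder.
    The weights here are illustrative, not from the paper."""
    return w_dist * travel_distance + w_obs * n_obstacles + w_wall * int(near_wall)

def episodes_for(success_rate, base=100, max_extra=400):
    """Adaptively allocate training episodes: a lower current
    success rate yields more episodes (illustrative linear rule)."""
    return base + int(max_extra * (1.0 - success_rate))

def build_curriculum(tasks):
    """Order (distance, n_obstacles, near_wall) tasks easy-to-hard."""
    return sorted(tasks, key=lambda t: difficulty(*t))

# Example: three candidate tasks, trained in order of difficulty.
tasks = [(5.0, 4, True), (1.0, 0, False), (3.0, 2, False)]
curriculum = build_curriculum(tasks)  # shortest, least-cluttered task first
```

With these weights the robot would train on the short, obstacle-free task first, and a task with a 50% current success rate would receive 300 episodes under the linear rule.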
Pages: 137281 - 137294
Number of pages: 14