Learning option MDPs from small data

Cited by: 0
Authors
Zehfroosh, Ashkan [1 ]
Tanner, Herbert G. [1 ]
Heinz, Jeffrey [2 ]
Affiliations
[1] Univ Delaware, Dept Mech Engn, Newark, DE 19716 USA
[2] SUNY Stony Brook, Dept Linguist & Inst Adv Computat Sci, Stony Brook, NY 11794 USA
Source
2018 ANNUAL AMERICAN CONTROL CONFERENCE (ACC), 2018
Keywords
ACQUISITION; INFANTS;
DOI
Not available
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Learning from small data is a challenge that arises in applications of human-robot interaction (HRI) in the context of pediatric rehabilitation. Discrete models of computation such as a Markov decision process (MDP) can be used to capture the dynamics of HRI, but the parameters of those models are usually unknown and (human-)subject dependent. This paper combines an abstraction method for MDPs with a parameter estimation method, originally developed for natural language processing, that is designed specifically to operate on small data. The combination expedites learning from small data and yields more accurate models that support more effective decision-making. Numerical evidence in support of the approach is offered in a comparative study on a small grid-world example.
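The abstract does not spell out the estimator it borrows from natural language processing. As a minimal, purely illustrative sketch, the Python snippet below uses add-one (Laplace) smoothing, a standard NLP technique for sparse counts, as a stand-in for the paper's small-data estimator when fitting MDP transition probabilities from only a handful of observed transitions; the function and variable names are hypothetical and the paper's actual method may differ.

# Illustrative sketch only: Laplace (add-one) smoothing stands in for the
# NLP-style small-data estimator mentioned in the abstract; the paper's
# actual estimator may differ.
from collections import defaultdict

def estimate_transitions(samples, states, alpha=1.0):
    """Estimate P(s' | s, a) from a small set of (s, a, s') samples.

    alpha is a smoothing pseudo-count; alpha = 0 recovers plain
    maximum-likelihood estimation, which assigns probability zero to
    any transition never observed in the data.
    """
    counts = defaultdict(lambda: defaultdict(lambda: defaultdict(float)))
    for s, a, s_next in samples:
        counts[s][a][s_next] += 1.0

    P = {}
    for s, actions in counts.items():
        P[s] = {}
        for a, outcomes in actions.items():
            total = sum(outcomes.values()) + alpha * len(states)
            P[s][a] = {s2: (outcomes.get(s2, 0.0) + alpha) / total
                       for s2 in states}
    return P

# Tiny grid-world-style data set with only three observed transitions.
data = [("s0", "right", "s1"), ("s0", "right", "s1"), ("s1", "up", "s2")]
P = estimate_transitions(data, states=["s0", "s1", "s2"])
print(P["s0"]["right"])  # no next state gets probability exactly zero

With alpha = 1 and three states, the two observed s0 -right-> s1 transitions give P(s1 | s0, right) = 3/5, while the unseen outcomes s0 and s2 each receive 1/5 instead of zero, which is the point of smoothing when data are scarce.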
Pages: 252-257
Number of pages: 6
Related papers
50 records in total
  • [1] Learning Large Graph-Based MDPs With Historical Data
    Haksar, Ravi N.
    Schwager, Mac
    IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2022, 9 (03): : 1447 - 1458
  • [2] Learning sequential option hedging models from market data
    Nian, Ke
    Coleman, Thomas F.
    Li, Yuying
    JOURNAL OF BANKING & FINANCE, 2021, 133
  • [3] ε-MDPs: Learning in varying environments
    Szita, I
    Takács, B
    Lorincz, A
    JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 3 (01) : 145 - 173
  • [4] Learning to Branch with Tree MDPs
    Scavuzzo, Lara
    Chen, Feng Yang
    Chetelat, Didier
    Gasse, Maxime
    Lodi, Andrea
    Yorke-Smith, Neil
    Aardal, Karen
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [5] Reinforcement learning for MDPs with constraints
    Geibel, Peter
    MACHINE LEARNING: ECML 2006, PROCEEDINGS, 2006, 4212 : 646 - 653
  • [6] Option Pricing Using Machine Learning with Intraday Data of TAIEX Option
    Wang, Chou-Wen
    Wu, Chin-Wen
    Chen, Po-Lin
    HCI IN BUSINESS, GOVERNMENT AND ORGANIZATIONS, PT II, HCIBGO 2023, 2023, 14039 : 214 - 224
  • [7] Efficient reinforcement learning in factored MDPs
    Kearns, M
    Koller, D
    IJCAI-99: PROCEEDINGS OF THE SIXTEENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOLS 1 & 2, 1999, : 740 - 747
  • [8] Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
    Sutton, RS
    Precup, D
    Singh, S
    ARTIFICIAL INTELLIGENCE, 1999, 112 (1-2) : 181 - 211
  • [9] Multitask reinforcement learning on the distribution of MDPs
    Tanaka, F
    Yamamura, M
    2003 IEEE INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE IN ROBOTICS AND AUTOMATION, VOLS I-III, PROCEEDINGS, 2003, : 1108 - 1113
  • [10] Expedited Learning in MDPs with Side Information
    Ornik, Melkior
    Fu, Jie
    Lauffer, Niklas T.
    Perera, W. K.
    Alshiekh, Mohammed
    Ono, Masahiro
    Topcu, Ufuk
    2018 IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2018, : 1941 - 1948