Online Learning of Feasible Strategies in Unknown Environments

Cited by: 0
Authors
Paternain, Santiago [1]
Ribeiro, Alejandro [1]
Affiliations
[1] Univ Penn, Dept Elect & Syst Engn, 200 S 33rd St, Philadelphia, PA 19104 USA
Keywords
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
An environment is defined as a set of constraint functions that vary arbitrarily over time. An agent wants to select feasible actions that keep all the constraints negative, but must do so causally; i.e., the dynamical system that determines actions is such that only their time derivatives can depend on the current constraints. An environment is said to be viable if there exists an action that satisfies the constraints at all times. The fit of a trajectory is defined as a vector that integrates the constraint violations over time and is used to measure the extent to which a policy succeeds in learning feasible actions. An online saddle point controller is proposed to control fit and is shown to do so under minimal technical conditions. The online saddle point controller pushes actions along a linear combination of the negative gradients of the constraints and dynamically adapts the coefficients of this linear combination to find appropriate weightings. Concepts are illustrated throughout with the problem of a shepherd who wants to stay close to all sheep in a herd. Numerical experiments show that the controller allows the shepherd to do so.
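The controller described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a forward-Euler discretization of the dynamics, a hypothetical herd of three sheep moving on small circles, and distance constraints of the form f_i(t, x) = ||x - y_i(t)||^2 - r^2. The names `sheep_positions`, `constraints`, `saddle_point_step`, and `simulate`, as well as the radius, gain, and step size, are illustrative assumptions and do not come from the paper.

```python
import numpy as np

# Hypothetical herd: three sheep circling around fixed centers (assumption).
CENTERS = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])


def sheep_positions(t):
    """Positions y_i(t) of the sheep at time t, one row per sheep."""
    offset = 0.2 * np.array([np.cos(t), np.sin(t)])
    return CENTERS + offset


def constraints(x, t, radius=1.5):
    """f_i(t, x) = ||x - y_i(t)||^2 - radius^2; feasible when all entries <= 0."""
    diff = x - sheep_positions(t)
    return np.sum(diff ** 2, axis=1) - radius ** 2


def constraint_gradients(x, t):
    """Row i holds the gradient of f_i with respect to the action x."""
    return 2.0 * (x - sheep_positions(t))


def saddle_point_step(x, lam, t, dt=0.01, gain=1.0):
    """One Euler step of the (assumed) saddle point dynamics.

    The action moves along a linear combination of negative constraint
    gradients weighted by lam; lam grows with the constraint values and
    is projected back onto the nonnegative orthant.
    """
    f = constraints(x, t)
    grads = constraint_gradients(x, t)
    x_new = x - dt * gain * (lam @ grads)
    lam_new = np.maximum(lam + dt * gain * f, 0.0)
    return x_new, lam_new, f


def simulate(x0, steps=5000, dt=0.01):
    """Run the controller and accumulate the time integral of the constraints."""
    x = np.asarray(x0, dtype=float)
    lam = np.zeros(len(CENTERS))
    fit = np.zeros(len(CENTERS))
    for k in range(steps):
        x, lam, f = saddle_point_step(x, lam, k * dt, dt)
        fit += dt * f  # Riemann-sum approximation of the fit vector
    return x, lam, fit


if __name__ == "__main__":
    x_final, lam_final, fit = simulate([5.0, 5.0])
    print("final shepherd position:", x_final)
    print("final multipliers:", lam_final)
    print("accumulated fit:", fit)
```

In this sketch the multipliers `lam` play the role of the adaptive coefficients mentioned in the abstract: a constraint that stays violated accumulates weight, which pulls the shepherd more strongly toward the corresponding sheep. The projection with `np.maximum` keeps the weights nonnegative, which is a standard choice for inequality constraints rather than a detail confirmed by this record.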
Pages: 4231-4238
Page count: 8
Related papers
50 in total
  • [31] Learning Autonomous Navigation in Unmapped and Unknown Environments
    He, Naifeng
    Yang, Zhong
    Bu, Chunguang
    Fan, Xiaoliang
    Wu, Jiying
    Sui, Yaoyu
    Que, Wenqiang
    [J]. SENSORS, 2024, 24 (18)
  • [32] Learning to Act for Perceiving in Partially Unknown Environments
    Lamanna, Leonardo
    Faridghasemnia, Mohamadreza
    Gerevini, Alfonso
    Saetti, Alessandro
    Saffiotti, Alessandro
    Serafini, Luciano
    Traverso, Paolo
    [J]. PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 5485 - 5493
  • [33] Self-regulated learning strategies & academic achievement in online higher education learning environments: A systematic review
    Broadbent, J.
    Poon, W. L.
    [J]. INTERNET AND HIGHER EDUCATION, 2015, 27 : 1 - 13
  • [34] Online algorithms with discrete visibility - Exploring unknown polygonal environments
    Ghosh, Subir Kumar
    Burdick, Joel Wakeman
    Bhattacharya, Amitava
    Sarkar, Sudeep
    [J]. IEEE ROBOTICS & AUTOMATION MAGAZINE, 2008, 15 (02) : 67 - 76
  • [35] Strategies for developing Online Learning
    韩爱庆
    沈俊辉
    张未未
    王丽
    [J]. 科技信息, 2012, (28) : 157 - 158
  • [36] An online complete coverage approach for a team of robots in unknown environments
    Hoang Huu Viet
    Choi, SeungYoon
    Chung, TaeChoong
    [J]. 2013 13TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS 2013), 2013, : 929 - 934
  • [37] Online and Consistent Occupancy Grid Mapping for Planning in Unknown Environments
    Sodhi, Paloma
    Ho, Bing-Jui
    Kaess, Michael
    [J]. 2019 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2019, : 7879 - 7886
  • [38] Online learners and their learning strategies
    Dewar, T
    Whittington, D
    [J]. JOURNAL OF EDUCATIONAL COMPUTING RESEARCH, 2000, 23 (04) : 385 - 403
  • [39] Strategies for Success in Online Learning
    Cantrell, Shirley W.
    O'Leary, Patricia
    Ward, Karen S.
    [J]. NURSING CLINICS OF NORTH AMERICA, 2008, 43 (04) : 547 - +
  • [40] LOCALLY ADAPTIVE ONLINE TRAJECTORY OPTIMIZATION IN UNKNOWN ENVIRONMENTS WITH RRTS
    Evans, Ethan N.
    Meyer, Patrick
    Seifert, Samuel
    Mavris, Dimitri N.
    Theodorou, Evangelos A.
    [J]. PROCEEDINGS OF THE ASME 11TH ANNUAL DYNAMIC SYSTEMS AND CONTROL CONFERENCE, 2018, VOL 3, 2018,