Egocentric Human Trajectory Forecasting With a Wearable Camera and Multi-Modal Fusion

Cited by: 6

Authors:

Qiu, Jianing [1]

Chen, Lipeng [2]

Gu, Xiao [1]

Lo, Frank P-W [1]

Tsai, Ya-Yen [1]

Sun, Jiankai [2,3]

Liu, Jiaqi [2,4]

Lo, Benny [1]

Affiliations:

[1] Imperial Coll London, Hamlyn Ctr Robot Surg, London SW7 2AZ, England

[2] Tencent Robot X, Shenzhen 518057, Peoples R China

[3] Stanford Univ, Dept Aeronaut & Astronaut, Stanford, CA 94305 USA

[4] Shanghai Jiao Tong Univ, Inst Med Robot, Shanghai 200240, Peoples R China

Source:

IEEE Robotics and Automation Letters | 2022, Vol. 7, No. 4

Keywords:

Human trajectory forecasting; egocentric vision; multi-modal learning

DOI:

10.1109/LRA.2022.3188101

Chinese Library Classification number:

TP24 [Robotics]

Discipline classification code:

080202; 1405

Abstract:

In this letter, we address the problem of forecasting the trajectory of an egocentric camera wearer (ego-person) in crowded spaces. The trajectory forecasting ability learned from the data of different camera wearers walking around in the real world can be transferred to assist visually impaired people in navigation, as well as to instill human navigation behaviours in mobile robots, enabling better human-robot interactions. To this end, a novel egocentric human trajectory forecasting dataset was constructed, containing real trajectories of people navigating in crowded spaces wearing a camera, as well as extracted rich contextual data. We extract and utilize three different modalities to forecast the trajectory of the camera wearer, i.e., his/her past trajectory, the past trajectories of nearby people, and the environment such as the scene semantics or the depth of the scene. A Transformer-based encoder-decoder neural network model, integrated with a novel cascaded cross-attention mechanism that fuses multiple modalities, has been designed to predict the future trajectory of the camera wearer. Extensive experiments have been conducted, with results showing that our model outperforms the state-of-the-art methods in egocentric human trajectory forecasting.
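
The abstract names a cascaded cross-attention mechanism but gives no details of it. The sketch below is only an illustration of the general idea, not the authors' model: it assumes single-head scaled dot-product attention with no learned projections, keys reused as values, a residual connection, and a fusion order of ego trajectory, then nearby people, then environment. All function names, feature sizes, and values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Each query attends over keys_values (keys double as values).

    Assumption: no learned projections; the attended vector is added
    back to the query as a residual connection.
    """
    dim = len(queries[0])
    fused = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys_values
        ]
        weights = softmax(scores)
        attended = [
            sum(w * kv[j] for w, kv in zip(weights, keys_values))
            for j in range(dim)
        ]
        fused.append([qi + ai for qi, ai in zip(q, attended)])
    return fused


def cascaded_fusion(ego, neighbours, environment):
    """Fuse modalities one after another (the order is an assumption)."""
    stage_one = cross_attention(ego, neighbours)      # ego <- nearby people
    return cross_attention(stage_one, environment)    # result <- environment


# Toy 2-D feature vectors, purely illustrative.
ego = [[0.0, 1.0], [0.5, 1.0]]
neighbours = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
environment = [[0.2, 0.2], [0.8, 0.1]]
fused = cascaded_fusion(ego, neighbours, environment)
```

Here `fused` keeps one vector per ego time step, so in a full model it could be passed to a decoder that predicts future positions. The cascade means the environment stage sees ego features that have already been conditioned on nearby people.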

Pages: 8799-8806

Number of pages: 8
