A multi-modal spatial-temporal model for accurate motion forecasting with visual fusion

Cited by: 7
Authors
Wang, Xiaoding [1, 2]
Liu, Jianmin [1, 2]
Lin, Hui [1, 2]
Garg, Sahil [3]
Alrashoud, Mubarak [4]
Affiliations
[1] Fujian Normal Univ, Coll Comp & Cyber Secur, 8 Xuefu South Rd, Fuzhou 350117, Fujian, Peoples R China
[2] Fujian Prov Univ, Engn Res Ctr Cyber Secur & Educ Informatizat, 8 Xuefu South Rd, Fuzhou 350117, Fujian, Peoples R China
[3] Ecole Technol Super, Elect Engn Dept, Montreal, PQ H3C 1K3, Canada
[4] King Saud Univ, Coll Comp & Informat Sci CCIS, Dept Software Engn SWE, Riyadh 11543, Saudi Arabia
Keywords
Motion forecasting; Intelligent transportation; Spatial-temporal cross attention; Multi-source visual fusion
DOI
10.1016/j.inffus.2023.102046
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-source visual information from ring cameras and stereo cameras provides direct observation of the road, traffic conditions, and vehicle behavior. However, visual information alone may not provide a complete understanding of the environment. Intelligent transportation systems therefore need to make effective use of multi-source, multi-modal data to accurately predict the future motion trajectories of vehicles. This paper presents a new model for predicting multi-modal trajectories by integrating multi-source visual features. A spatial-temporal cross attention fusion module is developed to capture the spatiotemporal interactions among vehicles while leveraging the road's geographic structure to improve prediction accuracy. Experimental results on the real-world Argoverse 2 dataset show that, compared with other methods, our method improves minADE (Minimum Average Displacement Error), minFDE (Minimum Final Displacement Error), and MR (Miss Rate) by 1.08%, 3.15%, and 2.14%, respectively, in unimodal prediction. In multimodal prediction, the improvements are 5.47%, 4.46%, and 6.50%, respectively. Our method effectively captures the temporal and spatial characteristics of vehicle movement trajectories, making it suitable for autonomous driving applications.
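The three metrics named in the abstract can be illustrated with a minimal sketch. It is not the paper's evaluation code. It assumes trajectories are lists of (x, y) points in metres, that each scenario has K candidate predictions, and that a scenario counts as a miss when its minFDE exceeds a 2.0 m threshold. The function names and the threshold default are illustrative assumptions. Benchmark implementations may also select the "best" candidate differently, for example by endpoint error rather than by average error.

```python
import math

# Illustrative sketch only: names, data layout, and the 2.0 m miss
# threshold are assumptions, not taken from the paper.

def ade(pred, gt):
    """Average L2 distance between one predicted trajectory and the ground truth."""
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)

def fde(pred, gt):
    """L2 distance between the final predicted point and the final true point."""
    return math.dist(pred[-1], gt[-1])

def min_ade(preds, gt):
    """Smallest ADE over the K candidate trajectories."""
    return min(ade(p, gt) for p in preds)

def min_fde(preds, gt):
    """Smallest FDE over the K candidate trajectories."""
    return min(fde(p, gt) for p in preds)

def miss_rate(scenarios, threshold=2.0):
    """Fraction of scenarios whose minFDE exceeds the threshold."""
    misses = sum(1 for preds, gt in scenarios if min_fde(preds, gt) > threshold)
    return misses / len(scenarios)

# Toy data: one ground-truth path and two scenarios.
gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
good = [[(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],   # exact match
        [(0.0, 3.0), (1.0, 3.0), (2.0, 3.0)]]   # offset by 3 m
bad = [[(0.0, 3.0), (1.0, 3.0), (2.0, 3.0)]]    # only the offset candidate

print(min_ade(good, gt), min_fde(good, gt))
print(miss_rate([(good, gt), (bad, gt)]))
```

With K = 1 the minimum is taken over a single candidate, which corresponds to the unimodal setting; larger K corresponds to the multimodal setting.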
Pages: 12
Related papers
50 in total
  • [21] Stacked Multi-modal Refining and Fusion Network for Visual Entailment
    Yao, Yuan
    Hu, Min
    Wang, Xiaohua
    Liu, Chuqing
    THIRTEENTH INTERNATIONAL CONFERENCE ON GRAPHICS AND IMAGE PROCESSING (ICGIP 2021), 2022, 12083
  • [22] Spatial-Temporal Graph Attention Model on Traffic Forecasting
    Zhang, Xinlan
    Zhang, Zhenguo
    Jin, Xiaofeng
    2020 13TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, BIOMEDICAL ENGINEERING AND INFORMATICS (CISP-BMEI 2020), 2020, : 999 - 1003
  • [23] Dynamic spatial-temporal model for carbon emission forecasting
    Gong, Mingze
    Zhang, Yongqi
    Li, Jia
    Chen, Lei
    JOURNAL OF CLEANER PRODUCTION, 2024, 463
  • [24] Interactive Fusion and Tracking For Multi-Modal Spatial Data Visualization
    Elshehaly, M.
    Gracanin, D.
    Gad, M.
    Elmongui, H. G.
    Matkovic, K.
    COMPUTER GRAPHICS FORUM, 2015, 34 (03) : 251 - 260
  • [25] Egocentric Human Trajectory Forecasting With a Wearable Camera and Multi-Modal Fusion
    Qiu, Jianing
    Chen, Lipeng
    Gu, Xiao
    Lo, Frank P-W
    Tsai, Ya-Yen
    Sun, Jiankai
    Liu, Jiaqi
    Lo, Benny
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2022, 7 (04) : 8799 - 8806
  • [26] A Multi-modal Medical Image Fusion Method in Spatial Domain
    Yan, Huibin
    Li, Zhongmin
    PROCEEDINGS OF 2019 IEEE 3RD INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC 2019), 2019, : 597 - 601
  • [27] Multi-modal Intermediate Fusion Model for diagnosis prediction
    Lu, You
    Niu, Ke
    Peng, Xueping
    Zeng, Jingni
    Pei, Su
    6TH INTERNATIONAL CONFERENCE ON INNOVATION IN ARTIFICIAL INTELLIGENCE, ICIAI2022, 2022, : 38 - 43
  • [28] Spatial-temporal distribution and forecasting model of precipitation using dynamic-statistical information fusion
    Zhao, Jun
    Xu, Jinchao
    Wang, Guoqing
    Jin, Juliang
    Hu, Xiaojie
    Guo, Yan
    Li, Xuechun
    JOURNAL OF WATER AND CLIMATE CHANGE, 2022, 13 (03) : 1425 - 1447
  • [29] Spatial-Temporal Fusion Graph Neural Networks for Traffic Flow Forecasting
    Li, Mengzhang
    Zhu, Zhanxing
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 4189 - 4196
  • [30] Multi-modal spatial querying
    Egenhofer, MJ
    ADVANCES IN GIS RESEARCH II, 1997, : 785 - 799