Multi-Modal Pedestrian Crossing Intention Prediction with Transformer-Based Model

Cited by: 0
Authors
Wang, Ting-Wei [1]
Lai, Shang-Hong [1]
Affiliations
[1] Natl Tsing Hua Univ, Hsinchu, Taiwan
Keywords
Pedestrian crossing intention prediction; multi-modal learning; transformer model; human posture
DOI
10.1561/116.20240019
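Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Subject classification codes
0808; 0809
Abstract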
Pedestrian crossing intention prediction based on computer vision plays a pivotal role in enhancing the safety of autonomous driving and advanced driver assistance systems. In this paper, we present a novel multi-modal pedestrian crossing intention prediction framework leveraging the transformer model. By integrating diverse sources of information and leveraging the transformer's sequential modeling and parallelization capabilities, our system accurately predicts pedestrian crossing intentions. We introduce a novel representation of traffic environment data and incorporate lifted 3D human pose and head orientation data to enhance the model's understanding of pedestrian behavior. Experimental results demonstrate the state-of-the-art accuracy of our proposed system on benchmark datasets.
Pages: 29