Pedestrian Crossing Intention Prediction with Multi-Modal Transformer-Based Model

Cited by: 0
Authors
Wang, Ting Wei [1 ]
Lai, Shang-Hong [1 ,2 ]
Affiliations
[1] Natl Tsing Hua Univ, Inst Informat Syst & Applicat, Hsinchu, Taiwan
[2] Natl Tsing Hua Univ, Dept Comp Sci, Hsinchu, Taiwan
DOI
10.1109/APSIPAASC58517.2023.10317161
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
The growing adoption of autonomous driving and advanced driver assistance systems can potentially prevent thousands of car accidents and casualties. In particular, pedestrian prediction and protection is an urgent development priority for such systems. Predicting pedestrians' intentions to cross the road, or their future actions, helps such systems assess the risk to pedestrians in front of the vehicle in advance. In this paper, we propose a multi-modal pedestrian crossing intention prediction framework based on the Transformer model to provide a better solution. Our method exploits the excellent sequence modeling capability of the Transformer, enabling the model to perform stably on this task. We also propose a novel representation of traffic environment information so that such information can be exploited efficiently. Moreover, we include lifted 3D human pose and 3D head orientation information estimated from pedestrian images in the model's input, allowing the model to better understand pedestrian posture. Finally, our experimental results show that the proposed system achieves state-of-the-art accuracy on benchmark datasets.
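The abstract describes fusing several per-frame pedestrian cues (bounding box, lifted 3D pose, 3D head orientation, traffic environment information) and modeling the observation sequence with a Transformer to predict crossing intention. The sketch below is not the authors' implementation; it is a minimal illustration of that general architecture under assumed feature dimensions (17 pose joints, a 3-D head orientation vector, a 4-D box, a 32-D environment encoding) and a simple sum-based fusion, using standard PyTorch modules.

```python
# Minimal sketch (not the paper's code): a multi-modal Transformer encoder that
# fuses per-frame pedestrian cues and predicts a binary crossing intention.
# All dimensions, modality choices, and the fusion scheme are illustrative assumptions.
import torch
import torch.nn as nn


class CrossingIntentTransformer(nn.Module):
    def __init__(self, d_model=128, n_heads=4, n_layers=3, seq_len=16,
                 pose_dim=17 * 3, head_dim=3, bbox_dim=4, env_dim=32):
        super().__init__()
        # One linear projection per modality, mapping raw features to d_model.
        self.proj_pose = nn.Linear(pose_dim, d_model)   # lifted 3D pose (17 joints x 3)
        self.proj_head = nn.Linear(head_dim, d_model)   # 3D head orientation
        self.proj_bbox = nn.Linear(bbox_dim, d_model)   # pedestrian bounding box
        self.proj_env = nn.Linear(env_dim, d_model)     # encoded traffic-environment vector
        # Learnable positional embedding over the observation window.
        self.pos_emb = nn.Parameter(torch.zeros(1, seq_len, d_model))
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.cls_head = nn.Linear(d_model, 2)            # crossing / not crossing

    def forward(self, pose, head, bbox, env):
        # Each input: (batch, seq_len, feature_dim). Fuse modalities by summing
        # their projections, then model the sequence with self-attention.
        x = (self.proj_pose(pose) + self.proj_head(head)
             + self.proj_bbox(bbox) + self.proj_env(env)) + self.pos_emb
        x = self.encoder(x)
        # Mean-pool over time and classify the whole observation window.
        return self.cls_head(x.mean(dim=1))


if __name__ == "__main__":
    model = CrossingIntentTransformer()
    B, T = 2, 16
    logits = model(torch.randn(B, T, 51), torch.randn(B, T, 3),
                   torch.randn(B, T, 4), torch.randn(B, T, 32))
    print(logits.shape)  # torch.Size([2, 2])
```

In practice a cross-attention or token-per-modality fusion could replace the summation used here; the summation is chosen only to keep the sketch short.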
Pages: 1349-1356
Number of pages: 8
Related Papers
50 records in total
  • [31] MULTI-VIEW AND MULTI-MODAL EVENT DETECTION UTILIZING TRANSFORMER-BASED MULTI-SENSOR FUSION
    Yasuda, Masahiro
    Ohishi, Yasunori
    Saito, Shoichiro
    Harada, Noboru
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 4638 - 4642
  • [32] Multi-modal pedestrian detection with misalignment based on modal-wise regression and multi-modal IoU
    Wanchaitanawong, Napat
    Tanaka, Masayuki
    Shibata, Takashi
    Okutomi, Masatoshi
    JOURNAL OF ELECTRONIC IMAGING, 2023, 32 (01)
  • [33] Learning Pedestrian Group Representations for Multi-modal Trajectory Prediction
    Bae, Inhwan
    Park, Jin-Hwi
    Jeon, Hae-Gon
    COMPUTER VISION, ECCV 2022, PT XXII, 2022, 13682 : 270 - 289
  • [34] Multi-modal long document classification based on Hierarchical Prompt and Multi-modal Transformer
    Liu, Tengfei
    Hu, Yongli
    Gao, Junbin
    Wang, Jiapu
    Sun, Yanfeng
    Yin, Baocai
    NEURAL NETWORKS, 2024, 176
  • [35] An intention-based multi-modal trajectory prediction framework for overtaking maneuver
    Zhang, Mingfang
    Liu, Ying
    Li, Huajian
    Wang, Li
    Wang, Pangwei
    2023 IEEE 26TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS, ITSC, 2023, : 2850 - 2855
  • [36] FNR: a similarity and transformer-based approach to detect multi-modal fake news in social media
    Ghorbanpour, Faeze
    Ramezani, Maryam
    Fazli, Mohammad Amin
    Rabiee, Hamid R.
    SOCIAL NETWORK ANALYSIS AND MINING, 2023, 13 (01)
  • [38] SERVER: Multi-modal Speech Emotion Recognition using Transformer-based and Vision-based Embeddings
    Nhat Truong Pham
    Duc Ngoc Minh Dang
    Bich Ngoc Hong Pham
    Sy Dzung Nguyen
    PROCEEDINGS OF 2023 8TH INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION TECHNOLOGY, ICIIT 2023, 2023, : 234 - 238
  • [39] UniTR: A Unified TRansformer-Based Framework for Co-Object and Multi-Modal Saliency Detection
    Guo, Ruohao
    Ying, Xianghua
    Qi, Yanyu
    Qu, Liao
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 7622 - 7635
  • [40] Dual-attention transformer-based hybrid network for multi-modal medical image segmentation
    Zhang, Menghui
    Zhang, Yuchen
    Liu, Shuaibing
    Han, Yahui
    Cao, Honggang
    Qiao, Bingbing
    SCIENTIFIC REPORTS, 2024, 14 (01)