Pedestrian Crossing Intention Prediction with Multi-Modal Transformer-Based Model

Cited: 0
Authors
Wang, Ting Wei [1 ]
Lai, Shang-Hong [1 ,2 ]
Affiliations
[1] National Tsing Hua University, Institute of Information Systems and Applications, Hsinchu, Taiwan
[2] National Tsing Hua University, Department of Computer Science, Hsinchu, Taiwan
DOI: 10.1109/APSIPAASC58517.2023.10317161
CLC Classification: TP18 [Artificial Intelligence Theory]
Subject Classification: 081104; 0812; 0835; 1405
Abstract
The growing adoption of autonomous driving and advanced driver assistance systems can potentially prevent thousands of car accidents and casualties. In particular, pedestrian prediction and protection is an urgent development priority for such systems. Predicting pedestrians' intention to cross the road, or their future actions, helps such systems assess in advance the risk posed to pedestrians in front of the vehicle. In this paper, we propose a multi-modal pedestrian crossing intention prediction framework based on the Transformer model to provide a better solution. Our method takes advantage of the Transformer's excellent sequence-modeling capability, enabling the model to perform stably on this task. We also propose a novel representation of traffic environment information, allowing such information to be exploited efficiently. Moreover, we incorporate the lifted 3D human pose and 3D head orientation estimated from pedestrian images into the model's prediction, allowing the model to better understand pedestrian posture. Finally, our experimental results show that the proposed system achieves state-of-the-art accuracy on benchmark datasets.
Pages: 1349-1356 (8 pages)
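
The abstract describes the architecture only at a high level. The following is a minimal, hypothetical PyTorch sketch of how such a multi-modal Transformer might fuse per-frame bounding boxes, lifted 3D pose, 3D head orientation, and a traffic-environment descriptor into a crossing/not-crossing prediction. All module names, feature dimensions, and the additive per-frame fusion scheme are illustrative assumptions, not the authors' actual design.

# Hypothetical sketch of a multi-modal Transformer for crossing-intention
# prediction, loosely following the abstract. Dimensions and the fusion
# strategy are assumptions for illustration, not the paper's architecture.
import torch
import torch.nn as nn

class CrossingIntentionTransformer(nn.Module):
    def __init__(self, d_model=128, n_heads=4, n_layers=3, seq_len=16):
        super().__init__()
        # Per-modality linear embeddings into a shared token space.
        self.bbox_embed = nn.Linear(4, d_model)       # 2D bounding box (x, y, w, h)
        self.pose_embed = nn.Linear(17 * 3, d_model)  # lifted 3D pose, 17 joints (assumed)
        self.head_embed = nn.Linear(3, d_model)       # 3D head orientation (yaw, pitch, roll)
        self.env_embed = nn.Linear(8, d_model)        # traffic-environment descriptor (assumed size)
        self.pos_embed = nn.Parameter(torch.zeros(1, seq_len, d_model))
        self.cls_token = nn.Parameter(torch.zeros(1, 1, d_model))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        self.classifier = nn.Linear(d_model, 2)  # crossing vs. not crossing

    def forward(self, bbox, pose, head_ori, env):
        # Each input: (batch, seq_len, feature_dim). Modalities are summed
        # into one token per time step, then a [CLS] token is prepended.
        tokens = (self.bbox_embed(bbox) + self.pose_embed(pose)
                  + self.head_embed(head_ori) + self.env_embed(env))
        tokens = tokens + self.pos_embed
        cls = self.cls_token.expand(tokens.size(0), -1, -1)
        out = self.encoder(torch.cat([cls, tokens], dim=1))
        return self.classifier(out[:, 0])  # classify from the [CLS] token

# Usage with random tensors for a batch of 2 pedestrians over 16 frames.
model = CrossingIntentionTransformer()
logits = model(torch.randn(2, 16, 4), torch.randn(2, 16, 51),
               torch.randn(2, 16, 3), torch.randn(2, 16, 8))
print(logits.shape)  # torch.Size([2, 2])

Summing modality embeddings per time step is one simple fusion choice; cross-attention between modality streams would be a natural alternative under the same token layout.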