Fully Convolutional Encoder-Decoder With an Attention Mechanism for Practical Pedestrian Trajectory Prediction

Cited by: 10
Authors
Chen, Kai [1 ]
Song, Xiao [2 ]
Yuan, Haitao [3 ]
Ren, Xiaoxiang [4 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Mech & Elect Engn, Nanjing 210016, Peoples R China
[2] Beihang Univ, Sch Cyber Sci & Technol, Beijing 100191, Peoples R China
[3] New Jersey Inst Technol, Dept Elect & Comp Engn, Newark, NJ 07102 USA
[4] Wendong New Dist Middle Sch, Lvliang 032100, Shanxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Trajectory; Predictive models; Feature extraction; Convolutional neural networks; Markov processes; Force; Convolution; Pedestrian behavior; convolution; long short-term memory (LSTM); attention mechanism;
DOI
10.1109/TITS.2022.3170874
Chinese Library Classification (CLC) number
TU [Building Science];
Discipline classification code
0813 ;
Abstract
Pedestrian trajectory prediction from video is essential for many practical traffic applications. Most existing pedestrian trajectory prediction methods are based on fully connected long short-term memory (LSTM) networks and perform well on public datasets. However, these methods still have three defects: a) most of them rely on manual annotations to obtain information about the environment surrounding the subject pedestrian, which limits practical applications; b) the interaction among pedestrians and obstacles in a scene has received little study, which leads to greater prediction error; c) traditional LSTM methods are based on the previous time step and ignore the correlation between the future and distant past states of the pedestrian, which generates unrealistic trajectories. To tackle these problems, first, in the data processing stage, we use an image semantic segmentation algorithm to obtain multi-category obstacle information and design an end-to-end "Siamese Position Extraction" model to obtain more accurate pedestrian interaction data. Second, we design an end-to-end fully convolutional LSTM encoder-decoder with an attention mechanism (FLEAM) to overcome the shortcomings of LSTM. Third, we compare FLEAM with several state-of-the-art LSTM-based prediction methods on multiple video sequences from the ETH, UCY and MOT20 datasets. The results show that our approach matches the prediction error of the best state-of-the-art results. However, FLEAM has more potential for practical application because it does not rely on manually annotated data. We further validate the effectiveness of FLEAM by employing manually annotated data and find that it then yields a much lower prediction error.
Pages: 20046-20060
Page count: 15