Fully Convolutional Encoder-Decoder With an Attention Mechanism for Practical Pedestrian Trajectory Prediction

Cited by: 10
Authors
Chen, Kai [1 ]
Song, Xiao [2 ]
Yuan, Haitao [3 ]
Ren, Xiaoxiang [4 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Mech & Elect Engn, Nanjing 210016, Peoples R China
[2] Beihang Univ, Sch Cyber Sci & Technol, Beijing 100191, Peoples R China
[3] New Jersey Inst Technol, Dept Elect & Comp Engn, Newark, NJ 07102 USA
[4] Wendong New Dist Middle Sch, Lvliang 032100, Shanxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Trajectory; Predictive models; Feature extraction; Convolutional neural networks; Markov processes; Force; Convolution; Pedestrian behavior; convolution; long short-term memory (LSTM); attention mechanism;
DOI
10.1109/TITS.2022.3170874
Chinese Library Classification number
TU [Architectural Science];
Discipline classification code
0813;
Abstract
Pedestrian trajectory prediction from video is essential for many practical traffic applications. Most existing pedestrian trajectory prediction methods are based on fully connected long short-term memory (LSTM) networks and perform well on public datasets. However, these methods still have three defects: a) most of them rely on manual annotations to obtain information about the environment surrounding the subject pedestrian, which limits practical applications; b) the interaction among pedestrians and obstacles in a scene has been little studied, which leads to greater prediction error; c) traditional LSTM methods are based on the previous moment and ignore the correlation between the future and distant past states of the pedestrian, which generates unrealistic trajectories. To tackle these problems, first, in the data processing stage, we use an image semantic segmentation algorithm to obtain multi-category obstacle information and design an end-to-end "Siamese Position Extraction" model to obtain more accurate pedestrian interaction data. Second, we design an end-to-end fully convolutional LSTM encoder-decoder with an attention mechanism (FLEAM) to overcome the shortcomings of LSTM. Third, we compare FLEAM with several state-of-the-art LSTM-based prediction methods on multiple video sequences in the ETH, UCY and MOT20 datasets. The results show that our approach yields the same prediction error as the best results of the state-of-the-art methods. However, FLEAM has more potential for practical application because it does not rely on manually annotated data. We further validate the effectiveness of FLEAM by employing manually annotated data, finding that it generates much less prediction error.
Pages: 20046-20060
Number of pages: 15