A benchmark dataset and semantics-guided detection network for spatial–temporal human actions in urban driving scenes

Authors
Zhong, Fujin [1, 2, 3]
Wu, Yini [1]
Yu, Hong [1, 2, 3]
Wang, Guoyin [2, 3]
Lu, Zhantao [1]
Affiliations
[1] School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, China
[2] Key Laboratory of Big Data Intelligent Computing, Chongqing University of Posts and Telecommunications, Chongqing, China
[3] Key Laboratory of Cyberspace Big Data Intelligent Security, Ministry of Education, China
Funding
National Natural Science Foundation of China
DOI
10.1016/j.patcog.2024.111035
Chinese Library Classification number
G2 [Information and Knowledge Dissemination]
Discipline classification codes
05; 0503
Abstract
In real urban driving scenes, human actions are complex, and multiple actions often occur concurrently. Detecting human actions in urban traffic scenes is of great significance for assisted or autonomous driving systems. To this end, we introduce the TITAN-Human Action dataset for the task of multi-person spatial–temporal action detection in urban driving scenes. TITAN-Human Action provides fine-grained action labels and location coordinates for 17,574 persons in the processed frames from the TITAN dataset. Furthermore, we propose a semantics-guided detection network (SGDNet) based on a semantic inference module (SIM) for spatial–temporal human action detection in urban driving scenes. SIM encodes the category labels into sentence vectors at the semantic level with prompting and embedding, uses graphs to represent the directed co-occurrence relations between categories, and adopts a graph convolutional network for semantic inference. SGDNet exploits the inference results of SIM to guide the visual branch toward better human action detection, thereby integrating visual and linguistic information. We conduct experiments to evaluate SGDNet and several baseline methods on the TITAN-Human Action dataset, and show the generalizability of SIM in spatial–temporal human action detection. The source code and annotation files will be available at https://github.com/yyhbswyn/SGDNet. © 2024 Elsevier Ltd
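The two graph-related steps named in the abstract (directed co-occurrence relations between categories, followed by graph convolution over label vectors) can be illustrated with a minimal sketch. It is not the authors' SIM implementation. It assumes that "directed co-occurrence" means the conditional frequency P(j | i), that the graph convolution is a single row-normalised layer ReLU(D^-1 (A + I) H W), and that the label vectors would normally come from a prompted sentence encoder; here they are hand-written placeholders. The label names, sample annotations, and function names are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def cooccurrence_matrix(samples: Sequence[Sequence[str]], labels: Sequence[str]) -> Matrix:
    """Directed co-occurrence (assumed form): A[i][j] = P(label j present | label i present)."""
    index = {name: k for k, name in enumerate(labels)}
    n = len(labels)
    counts = [0] * n                      # how often each label appears
    pair = [[0] * n for _ in range(n)]    # how often labels i and j appear together
    for sample in samples:
        present = sorted({index[name] for name in sample})
        for i in present:
            counts[i] += 1
            for j in present:
                if i != j:
                    pair[i][j] += 1
    return [[pair[i][j] / counts[i] if counts[i] else 0.0 for j in range(n)] for i in range(n)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product, kept dependency-free for the sketch."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj: Matrix, feats: Matrix, weight: Matrix) -> Matrix:
    """One propagation step: ReLU(D^-1 (A + I) H W) with row normalisation (an assumption)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    a_norm = [[v / sum(row) for v in row] for row in a_hat]
    out = matmul(matmul(a_norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Hypothetical label set and per-person annotations (not taken from the dataset).
labels = ["walking", "carrying", "looking at phone"]
samples = [["walking", "carrying"], ["walking"], ["walking", "looking at phone"], ["carrying"]]

A = cooccurrence_matrix(samples, labels)

# Placeholder 2-d label vectors standing in for sentence embeddings.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity weight, so the output shows pure neighbour mixing
out = gcn_layer(A, H, W)
```

Because the matrix is built from conditional frequencies, A[i][j] and A[j][i] generally differ, which is what makes the relation directed: in the toy data, "looking at phone" always co-occurs with "walking", but not the other way round.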