Spatial-temporal Graph Inference with Granger Causality Relation for Group Activity Analysis

Cited by: 0
Authors
Xie Z. [1, 2, 3]
Li J. [3]
Wu K.-W. [1, 2, 3]
Jiao C. [3]
Affiliations
[1] Key Laboratory of Knowledge Engineering with Big Data, Hefei University of Technology, Ministry of Education, Hefei
[2] Anhui Province Key Laboratory of Industry Safety and Emergency Technology, Hefei University of Technology, Hefei
[3] School of Computer Science and Information Engineering, Hefei University of Technology, Hefei
Keywords
Granger causality relation; graph convolutional inference; group activity recognition; spatial-temporal context; temporal delay dependency
DOI
10.11897/SP.J.1016.2023.00856
Abstract
Causality reflects the directional effect of an active actor on a passive actor and is common in group interactions. The difficulty of causality detection lies in the complex temporal dynamics of the sequential features of interacting actors. Existing methods use recurrent neural networks to describe the temporal dynamics of interaction relations, and some use temporal attention mechanisms to describe temporal dependencies. However, they do not analyze the dependency between two actors and therefore struggle to distinguish the active actor from the passive actor in an interaction. In this work, we design a Granger causality-based spatiotemporal graph model to learn the active-passive relations between interacting actors. To detect Granger causality, the model defines an autoregression function over the temporal feature sequence of a single individual, which describes how that individual's action depends on its own history, and a correlative regression function over the temporal feature sequences of two individuals, which describes how the action depends on both individuals. The model then compares the autoregressive error with the correlative regression error. When the autoregressive error is significantly larger than the correlative regression error, the correlated individual changes the action of the other individual, so the model labels the correlated individual as active and the other as passive. The correlative regression function considers the feature sequences of the two individuals at multiple time delays, which allows the model to learn the time delay between the actions of the two individuals. This delay is used to align the temporal features of the active individual with those of the passive individual. The temporally aligned features of the active individual provide temporal and spatial context for the passive individual and are fused with the passive individual's features channel by channel.
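The error comparison described above can be sketched in a few lines of Python. This is a minimal illustration on scalar sequences, not the paper's model: the function names (`solve`, `fit_error`, `granger_errors`, `is_active`), the ordinary least-squares fit, and the fixed error-ratio threshold of 2.0 are all assumptions introduced here, since the abstract does not specify the regression form or the significance test.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_error(rows, targets):
    """Mean squared residual of a least-squares fit of targets on rows."""
    k = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    rhs = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    w = solve(gram, rhs)
    residuals = [t - sum(wi * ri for wi, ri in zip(w, r)) for r, t in zip(rows, targets)]
    return sum(e * e for e in residuals) / len(residuals)

def granger_errors(active, passive, lag):
    """Return (autoregressive error, correlative regression error) for the passive sequence."""
    own, joint, targets = [], [], []
    for t in range(lag, len(passive)):
        past_passive = [passive[t - k] for k in range(1, lag + 1)]
        past_active = [active[t - k] for k in range(1, lag + 1)]
        own.append([1.0] + past_passive)                  # passive history only
        joint.append([1.0] + past_passive + past_active)  # both histories
        targets.append(passive[t])
    return fit_error(own, targets), fit_error(joint, targets)

def is_active(active, passive, lag, ratio=2.0):
    """Assumed decision rule: the candidate is active if it cuts the error by `ratio`."""
    own_err, joint_err = granger_errors(active, passive, lag)
    return own_err > ratio * joint_err

# Synthetic example: y follows x with a delay of two steps plus small noise.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(200)]
y = [0.0, 0.0] + [0.8 * x[t - 2] + 0.05 * random.gauss(0, 1) for t in range(2, 200)]
print("x drives y:", is_active(x, y, lag=3))
print("y drives x:", is_active(y, x, lag=3))
```

In the synthetic example, adding the history of `x` sharply reduces the error of predicting `y`, while adding the history of `y` barely helps predict `x`, so only `x` is flagged as the active sequence.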
The model constructs causal graphs over multi-scale spatiotemporal features to fully describe the interaction between appearance patterns, location constraints, and Granger causality among individuals. Through graph inference, the multi-scale causal graph embeds contextual features into the individual features and group features. Experiments on the Volleyball and Collective Activity datasets compare the model with three families of state-of-the-art methods: (1) spatial relation pooling models, such as the Hierarchical Deep Temporal Model (HDTM); (2) spatial relation graph models, including the Social Scene Understanding model (SSU), Convolutional Relational Machine (CRM), Hierarchical Relational Network (HRN), Actor Relation Graph model (ARG), Graph Attention Interaction Model (GAIM), Actor-Transformer (AT), Position Distribution and Appearance Relation model (PDAR), and Multi-level Interaction Relation model (MLIR); and (3) spatial-temporal relation models, including the Confidence-Energy Recurrent Network (CERN), spatial-temporal attentive graph network (stagNet), Progressive Relation Learning model (PRL), Graph LSTM-in-LSTM model (GLIL), Visual Context model (VC), GroupFormer (GF), Partial Context Embedding (PCE), and Coherence Constrained Graph LSTM (CCGLSTM). Our Granger causality-based relation detector can describe the relations between potential active actors and passive actors. The channel-wise temporal causality graph inference module enhances the features of the passive actor by fusing them with temporally delayed features of the active actor. The graph model built on the Granger causality relation can describe effective interactions between actors and provide contextual features for group activity recognition. © 2023 Science Press. All rights reserved.
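The delay alignment and channel-wise fusion mentioned in the abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: the abstract does not give the fusion operator, so `fuse_channelwise` uses a simple per-channel weighted sum, `estimate_delay` picks the delay with the lowest single-lag regression error rather than using the paper's multi-delay regression, and all function names are hypothetical.

```python
def estimate_delay(active, passive, max_delay):
    """Pick the delay d in 1..max_delay whose fit passive[t] ~ a * active[t - d] has the lowest error."""
    best_delay, best_err = 1, float("inf")
    steps = range(max_delay, len(passive))
    for d in range(1, max_delay + 1):
        xs = [active[t - d] for t in steps]
        ys = [passive[t] for t in steps]
        energy = sum(v * v for v in xs)
        # Closed-form least-squares slope for a single regressor without intercept.
        a = sum(u * v for u, v in zip(xs, ys)) / energy if energy > 0 else 0.0
        err = sum((v - a * u) ** 2 for u, v in zip(xs, ys))
        if err < best_err:
            best_delay, best_err = d, err
    return best_delay

def shift_by_delay(active_feats, delay):
    """Align active features to the passive timeline: step t receives active_feats[t - delay].
    The first `delay` steps have no source frame and reuse the earliest one."""
    return [active_feats[max(t - delay, 0)] for t in range(len(active_feats))]

def fuse_channelwise(passive_feats, aligned_feats, weights):
    """Add aligned active features to passive features with one weight per channel."""
    return [
        [p + w * a for p, a, w in zip(p_t, a_t, weights)]
        for p_t, a_t in zip(passive_feats, aligned_feats)
    ]

# Scalar sequences where the passive signal repeats the active one two steps later.
active_signal = [0, 1, 0, 0, 2, 0, 0, 3, 0, 0]
passive_signal = [0, 0, 0, 1, 0, 0, 2, 0, 0, 3]
print(estimate_delay(active_signal, passive_signal, max_delay=3))  # prints 2

# Two-channel features over four time steps, fused with a delay of one step.
active_feats = [[1, 10], [2, 20], [3, 30], [4, 40]]
passive_feats = [[0.0, 0.0] for _ in range(4)]
aligned = shift_by_delay(active_feats, delay=1)
print(fuse_channelwise(passive_feats, aligned, weights=[1.0, 0.5]))
# prints [[1.0, 5.0], [1.0, 5.0], [2.0, 10.0], [3.0, 15.0]]
```

The per-channel weights stand in for whatever learned channel-wise mechanism the full model uses; in a trained network they would be parameters rather than constants.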
Pages: 856-876
Page count: 20
Related papers
39 in total
  • [1] Ding Li, Wensheng Zhang, Attentive pooling for group activity recognition, SCIENTIA SINICA Informationis, 51, pp. 399-412, (2021)
  • [2] Hangjie Yuan, Dong Ni, Learning visual context for group activity recognition, Proceedings of the 35th AAAI Conference on Artificial Intelligence, pp. 3261-3269, (2021)
  • [3] Lihua Lu, Yao Lu, Ruizhe Yu, Huijun Di, Lin Zhang, Shunzhou Wang, GAIM: graph attention interaction model for collective activity recognition, IEEE Transactions on Multimedia, 22, 2, pp. 524-539, (2020)
  • [4] Lihua Lu, Yao Lu, Shunzhou Wang, Learning multi-level interaction relations and feature representations for group activity recognition, Proceedings of the 27th Conference on Multimedia Modeling, pp. 617-628, (2021)
  • [5] Mostafa S. Ibrahim, Greg Mori, Hierarchical relational networks for group activity recognition and retrieval, Proceedings of the 2018 European Conference on Computer Vision, pp. 742-758, (2018)
  • [6] Kirill Gavrilyuk, Ryan Sanford, Mehrsan Javan, Cees G. M. Snoek, Actor-transformers for group activity recognition, Proceedings of the 2020 IEEE Conference on Computer Vision and Pattern Recognition, pp. 836-845, (2020)
  • [7] Duoxuan Pei, Annan Li, Yunhong Wang, Group activity recognition by exploiting position distribution and appearance relation, Proceedings of the 27th Conference on Multimedia Modeling, pp. 123-135, (2021)
  • [8] Mengshi Qi, Yunhong Wang, Jie Qin, Annan Li, Jiebo Luo, Luc Van Gool, StagNet: an attentive semantic RNN for group activity and individual action recognition, IEEE Transactions on Circuits and Systems for Video Technology, 30, 2, pp. 549-565, (2020)
  • [9] Guyue Hu, Bo Cui, Yuan He, Shan Yu, Progressive relation learning for group activity recognition, Proceedings of the 2020 IEEE Conference on Computer Vision and Pattern Recognition, pp. 977-986, (2020)
  • [10] Xiangbo Shu, Liyan Zhang, Yunlian Sun, Jinhui Tang, Host-parasite: graph LSTM-in-LSTM for group activity recognition, IEEE Transactions on Neural Networks and Learning Systems, 32, 2, pp. 663-674, (2021)