Attentive Sequences Recurrent Network for Social Relation Recognition from Video

Cited by: 4
Authors
Lv, Jinna [1 ,2 ]
Wu, Bin [1 ]
Zhang, Yunlei [1 ]
Xiao, Yunpeng [3 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing 100876, Peoples R China
[2] Beijing Informat Sci & Technol Univ, Beijing, Peoples R China
[3] Chongqing Univ Posts & Telecommun, Chongqing 400065, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
social relation recognition; video analysis; deep learning; LSTM; attention mechanism;
DOI
10.1587/transinf.2019EDP7104
CLC Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Recently, social relation analysis has received increasing attention on text and image data. However, social relation analysis from video, although an important problem, is largely missing from the current literature. Two challenges remain: 1) it is hard to learn a satisfactory mapping function from low-level pixels to the high-level social relation space; 2) it is difficult to efficiently select the most relevant information from noisy and unsegmented video. In this paper, we present an Attentive Sequences Recurrent Network model, called ASRN, to address these challenges. First, to exploit multiple clues, we design a Multiple Feature Attention (MFA) mechanism that fuses multiple visual features (i.e., image, motion, body, and face). In this way, we obtain an appropriate mapping from low-level video pixels to the high-level social relation space. Second, we design a sequence recurrent network based on a Global and Local Attention (GLA) mechanism. Specifically, GLA uses attention to integrate the global video feature with local sequence features and thereby select the sequences most relevant to the recognition task, so the GLA module copes better with noisy and unsegmented video. Finally, extensive experiments on the SRIV dataset demonstrate the performance of our ASRN model.
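To make the pipeline described in the abstract concrete, below is a minimal sketch of the two named components (MFA fusion over multiple visual features, and GLA pooling over LSTM states), assuming a PyTorch implementation. The layer sizes, the dot-product form of the attention, the mean-pooled global summary, and the number of relation classes are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultipleFeatureAttention(nn.Module):
    """Fuse per-frame image, motion, body, and face features with attention weights."""

    def __init__(self, feat_dims, hidden_dim=512):
        super().__init__()
        # One projection per modality into a shared space.
        self.projections = nn.ModuleList([nn.Linear(d, hidden_dim) for d in feat_dims])
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, feats):
        # feats: list of tensors, each of shape (batch, time, feat_dim_i)
        projected = torch.stack(
            [proj(f) for proj, f in zip(self.projections, feats)], dim=2
        )  # (batch, time, n_modalities, hidden)
        weights = F.softmax(self.score(torch.tanh(projected)), dim=2)
        # Attention-weighted sum over modalities -> fused per-frame feature.
        return (weights * projected).sum(dim=2)  # (batch, time, hidden)


class GlobalLocalAttention(nn.Module):
    """Score each local LSTM state against a global video feature, then pool."""

    def __init__(self, hidden_dim=512):
        super().__init__()
        self.transform = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, local_states, global_feat):
        # local_states: (batch, time, hidden); global_feat: (batch, hidden)
        query = self.transform(global_feat).unsqueeze(1)           # (batch, 1, hidden)
        scores = (local_states * query).sum(dim=-1, keepdim=True)  # (batch, time, 1)
        alpha = F.softmax(scores, dim=1)                           # weights over time
        return (alpha * local_states).sum(dim=1)                   # (batch, hidden)


class ASRN(nn.Module):
    """End-to-end sketch: MFA fusion -> LSTM over frames -> GLA pooling -> classifier."""

    def __init__(self, feat_dims=(2048, 1024, 512, 512), hidden_dim=512, n_relations=8):
        super().__init__()
        self.mfa = MultipleFeatureAttention(feat_dims, hidden_dim)
        self.lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.gla = GlobalLocalAttention(hidden_dim)
        self.classifier = nn.Linear(hidden_dim, n_relations)

    def forward(self, feats):
        fused = self.mfa(feats)                 # (batch, time, hidden)
        local_states, _ = self.lstm(fused)      # per-frame (local) hidden states
        global_feat = local_states.mean(dim=1)  # simple global summary (assumption)
        return self.classifier(self.gla(local_states, global_feat))


# Usage example with hypothetical feature dimensions: four feature streams
# for a batch of 2 clips with 30 frames each.
feats = [torch.randn(2, 30, d) for d in (2048, 1024, 512, 512)]
logits = ASRN()(feats)  # (2, n_relations)
```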
Pages: 2568-2576
Page count: 9