Attentive Sequences Recurrent Network for Social Relation Recognition from Video

Cited by: 4
Authors
Lv, Jinna [1 ,2 ]
Wu, Bin [1 ]
Zhang, Yunlei [1 ]
Xiao, Yunpeng [3 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing 100876, Peoples R China
[2] Beijing Informat Sci & Technol Univ, Beijing, Peoples R China
[3] Chongqing Univ Posts & Telecommun, Chongqing 400065, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
social relation recognition; video analysis; deep learning; LSTM; attention mechanism;
DOI
10.1587/transinf.2019EDP7104
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Recently, social relation analysis has received increasing attention, extending from text to image data. However, social relation analysis from video, an important problem, is still lacking in the current literature. Two challenges remain: 1) it is hard to learn a satisfactory mapping function from low-level pixels to the high-level social relation space; 2) it is difficult to efficiently select the most relevant information from noisy and unsegmented video. In this paper, we present an Attentive Sequences Recurrent Network model, called ASRN, to address these challenges. First, in order to exploit multiple clues, we design a Multiple Feature Attention (MFA) mechanism that fuses multiple visual features (i.e., image, motion, body, and face). In this way, we can learn an appropriate mapping function from low-level video pixels to the high-level social relation space. Second, we design a sequence recurrent network based on a Global and Local Attention (GLA) mechanism. Specifically, the attention mechanism in GLA integrates the global feature with local sequence features to select the sequences most relevant to the recognition task, so the GLA module can better handle noisy and unsegmented video. Finally, extensive experiments on the SRIV dataset demonstrate the effectiveness of our ASRN model.
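To make the two attention mechanisms described in the abstract more concrete, the following is a minimal PyTorch sketch of how an MFA-style fusion over per-cue features and a GLA-style attention over LSTM steps could be wired up. All module names, dimensions, and scoring functions here are illustrative assumptions and are not taken from the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultipleFeatureAttention(nn.Module):
    """Sketch of an MFA-style module: fuse image/motion/body/face
    features with learned attention weights (dimensions are assumptions)."""
    def __init__(self, feat_dim=512):
        super().__init__()
        self.score = nn.Linear(feat_dim, 1)  # one scalar score per visual cue

    def forward(self, feats):
        # feats: (batch, num_cues, feat_dim), one row per cue (image, motion, body, face)
        scores = self.score(feats).squeeze(-1)           # (batch, num_cues)
        weights = F.softmax(scores, dim=-1)              # attention over the cues
        fused = (weights.unsqueeze(-1) * feats).sum(1)   # (batch, feat_dim)
        return fused, weights

class GlobalLocalAttention(nn.Module):
    """Sketch of a GLA-style module: score each LSTM time step against a
    global video feature and pool the most relevant steps."""
    def __init__(self, feat_dim=512, hidden_dim=256):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.query = nn.Linear(feat_dim, hidden_dim)  # project global feature to a query

    def forward(self, seq_feats, global_feat):
        # seq_feats: (batch, T, feat_dim); global_feat: (batch, feat_dim)
        h, _ = self.lstm(seq_feats)                      # (batch, T, hidden_dim)
        q = self.query(global_feat).unsqueeze(-1)        # (batch, hidden_dim, 1)
        scores = torch.bmm(h, q).squeeze(-1)             # (batch, T)
        weights = F.softmax(scores, dim=-1)              # attention over time steps
        video_repr = (weights.unsqueeze(-1) * h).sum(1)  # (batch, hidden_dim)
        return video_repr, weights
```

The design intuition this sketch follows is the one stated in the abstract: per-cue attention decides how much each visual stream contributes to the fused frame feature, while the global feature acts as a query that down-weights irrelevant segments of noisy, unsegmented video before classification.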
Pages: 2568-2576
Number of pages: 9
Related Papers
50 records in total
  • [31] Recurrent Region Attention and Video Frame Attention Based Video Action Recognition Network Design
    Sang H.-F.
    Zhao Z.-Y.
    He D.-K.
    Chinese Institute of Electronics (48): 1052 - 1061
  • [32] Temporal-attentive Covariance Pooling Networks for Video Recognition
    Gao, Zilin
    Wang, Qilong
    Zhang, Bingbing
    Hu, Qinghua
    Li, Peihua
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,
  • [33] TEMPORAL PYRAMID RELATION NETWORK FOR VIDEO-BASED GESTURE RECOGNITION
    Yang, Ke
    Li, Rongchun
    Qiao, Peng
    Wang, Qiang
    Li, Dongsheng
    Dou, Yong
    2018 25TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2018, : 3104 - 3108
  • [34] Boosting Descriptors Condensed from Video Sequences for Place Recognition
    Chin, Tat-Jun
    Goh, Hanlin
    Lim, Joo-Hwee
    2008 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, VOLS 1-3, 2008, : 1386 - 1393
  • [35] Appearance-based pain recognition from video sequences
    Monwar, Md. Maruf
    Rezaei, Siamak
    2006 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORK PROCEEDINGS, VOLS 1-10, 2006, : 2429 - +
  • [36] Incremental recognition of traffic situations from video image sequences
    Haag, M
    Nagel, HH
    IMAGE AND VISION COMPUTING, 2000, 18 (02) : 137 - 153
  • [37] Face recognition from spatially-morphed video sequences
    Sebastiao, R.
    Silva, Jorge A.
    Padilha, A. J.
    IMAGE ANALYSIS AND RECOGNITION, PT 2, 2006, 4142 : 365 - 374
  • [38] Recognition of Mexican Sign Language from Frames in Video Sequences
    Cervantes, Jair
    Garcia-Lamont, Farid
    Rodriguez-Mazahua, Lisbeth
    Yee Rendon, Arturo
    Lopez Chau, Asdrubal
    INTELLIGENT COMPUTING THEORIES AND APPLICATION, ICIC 2016, PT II, 2016, 9772 : 353 - 362
  • [39] Activity recognition from video sequences using declarative models
    Rota, NA
    Thonnat, M
    ECAI 2000: 14TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2000, 54 : 673 - 677
  • [40] Social Attentive Network for Live Stream Recommendation
    Yu, Dung-Ru
    Chu, Chiao-Chuan
    Lai, Hsu-Chao
    Huang, Jiun-Long
    WWW'20: COMPANION PROCEEDINGS OF THE WEB CONFERENCE 2020, 2020, : 24 - 25