SAN: Structure-aware attention network for dyadic human relation recognition in images

Authors
Kaen Kogashi
Shohei Nobuhara
Ko Nishino
Affiliation
[1] Kyoto University, Department of Intelligence Science and Technology, Graduate School of Informatics
Keywords
Dyadic human relation recognition (DHR); DHR dataset; Multi-task learning
DOI
Not available
Abstract
We introduce a new dataset and method for Dyadic Human relation Recognition (DHR). DHR is a new task that concerns recognizing the type (i.e., verb) and roles of a two-person interaction. Unlike past human action detection, our goal is to extract richer information about the roles of the actors, i.e., which person (the subject) is acting on which other person (the object). To this end, we introduce the DHR-WebImages dataset, which consists of 22,046 images covering 51 DHR verb classes with per-image annotations of the verb and roles, as well as a test set for evaluating generalization, which we refer to as DHR-Generalization. We tackle DHR with a novel network inspired by the hierarchical nature of human cognitive perception. At the core of the network lies a “structure-aware attention” module that weights and integrates hierarchical visual cues associated with the DHR instance in the image. The feature hierarchy consists of three levels, namely the union, human, and joint levels, each of which extracts visual features relevant to the participants while modeling their cross-talk. We refer to this network as the Structure-aware Attention Network (SAN). Experimental results show that SAN achieves accurate DHR that is robust to limited visibility of the actors, and outperforms past methods by 3.04 mAP on the DHR-WebImages verb task.
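The idea of an attention module that "weights and integrates" cues from the union, human, and joint levels can be illustrated with a minimal sketch. This is not the SAN architecture: the abstract does not specify how the attention is computed, so the dot-product scoring, the `score_vector` parameter, the function names, and the toy feature values below are all assumptions made for illustration.

```python
import math

# Level names taken from the abstract; everything else is hypothetical.
LEVELS = ("union", "human", "joint")


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_levels(features, score_vector):
    """Weight each level's feature vector and sum them into one vector.

    features: dict mapping level name -> list of floats (same length each).
    score_vector: assumed scoring vector; a level's attention score is
        the dot product of its feature with this vector.
    Returns (fused_feature, weights_by_level).
    """
    scores = [
        sum(f * w for f, w in zip(features[level], score_vector))
        for level in LEVELS
    ]
    weights = softmax(scores)
    dim = len(score_vector)
    fused = [
        sum(weights[i] * features[level][j] for i, level in enumerate(LEVELS))
        for j in range(dim)
    ]
    return fused, dict(zip(LEVELS, weights))


# Toy 2-D features standing in for per-level visual features.
features = {"union": [1.0, 0.0], "human": [0.0, 1.0], "joint": [1.0, 1.0]}
score_vector = [0.5, 0.5]
fused, weights = fuse_levels(features, score_vector)
```

With these toy inputs the joint-level feature has the highest score, so it receives the largest weight, while the union and human levels are weighted equally. In a trained network the scoring would be learned and the features would come from image encoders rather than fixed lists.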
Pages: 46947-46966
Page count: 19