SAN: Structure-aware attention network for dyadic human relation recognition in images

Cited by: 0

Authors:

Kaen Kogashi

Shohei Nobuhara

Ko Nishino

Affiliation:

[1] Kyoto University, Department of Intelligence Science and Technology, Graduate School of Informatics

Source:

Multimedia Tools and Applications | 2024 / Vol. 83

Keywords:

Dyadic human relation recognition (DHR); DHR dataset; Multi-task learning

DOI:

Not available

Abstract:

We introduce a new dataset and method for Dyadic Human relation Recognition (DHR). DHR is a new task that concerns recognizing the type (i.e., verb) and roles of a two-person interaction. Unlike past human action detection, our goal is to extract richer information regarding the roles of the actors, i.e., which person (the subject) is acting on which person (the object). For this, we introduce the DHR-WebImages dataset, which consists of 22,046 images covering 51 DHR verb classes with per-image annotations of the verb and roles, along with a test set for evaluating generalization capabilities, which we refer to as DHR-Generalization. We tackle DHR by introducing a novel network inspired by the hierarchical nature of human cognitive perception. At the core of the network lies a “structure-aware attention” module that weights and integrates various hierarchical visual cues associated with the DHR instance in the image. The feature hierarchy consists of three levels, namely the union, human, and joint levels, each of which extracts visual features relevant to the participants while modeling their cross-talk. We refer to this network as the Structure-aware Attention Network (SAN). Experimental results show that SAN achieves accurate DHR that is robust to limited visibility of the actors, and outperforms past methods by 3.04 mAP on the DHR-WebImages verb task.

Pages: 46947-46966

Page count: 19
