Three-stream network with context convolution module for human-object interaction detection

Cited by: 4
Authors
Siadari, Thomhert S. [1, 2]
Han, Mikyong [2]
Yoon, Hyunjin [1, 2]
Affiliations
[1] Univ Sci & Technol, ETRI Sch, ICT Major, Daejeon, South Korea
[2] Elect & Telecommun Res Inst, City & Transportat ICT Res Dept, Daejeon, South Korea
Keywords
context convolution module; deep learning; HOI detection; human-object interactions; three-stream network
DOI
10.4218/etrij.2019-0230
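Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract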
Human-object interaction (HOI) detection is a popular computer vision task that detects interactions between humans and objects. This task can be useful in many applications that require a deeper understanding of semantic scenes. Current HOI detection networks typically consist of a feature extractor followed by detection layers comprising small filters (e.g., 1 × 1 or 3 × 3). Although small filters can capture local spatial features with few parameters, their small receptive fields prevent them from capturing the larger context needed to recognize interactions between humans and distant objects. Hence, we propose a three-stream HOI detection network that employs a context convolution module (CCM) in each stream. The CCM captures larger contexts from input feature maps by combining large separable convolution layers with residual-based convolution layers; because it uses fewer large separable filters, it does so without increasing the number of parameters. We evaluate our HOI detection method on two benchmark datasets, V-COCO and HICO-DET, and demonstrate its state-of-the-art performance.
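The parameter argument in the abstract can be illustrated with simple arithmetic. The sketch below is not the authors' CCM; it only assumes that a "large separable convolution" means a k × 1 convolution followed by a 1 × k convolution, and that "fewer filters" means a reduced number of intermediate channels. The channel counts (256, 64) and the kernel size (15) are hypothetical values chosen for illustration, not figures from the paper.

```python
def conv_params(c_in, c_out, kh, kw):
    """Weight count of a standard 2-D convolution (bias terms ignored)."""
    return c_in * c_out * kh * kw


def separable_params(c_in, c_mid, c_out, k):
    """Weight count of a k x 1 convolution followed by a 1 x k convolution.

    c_mid is the number of filters in the first (k x 1) layer.
    """
    return conv_params(c_in, c_mid, k, 1) + conv_params(c_mid, c_out, 1, k)


# Hypothetical sizes, not taken from the paper.
c_in, c_out = 256, 256

small = conv_params(c_in, c_out, 3, 3)              # dense 3 x 3 layer
large_dense = conv_params(c_in, c_out, 15, 15)      # dense 15 x 15 layer
large_sep = separable_params(c_in, 64, c_out, 15)   # 15 x 1 then 1 x 15, 64 mid filters

print(f"3 x 3 dense:      {small:>10,d} weights")
print(f"15 x 15 dense:    {large_dense:>10,d} weights")
print(f"15 x 15 separable:{large_sep:>10,d} weights")
```

Under these assumed sizes, the separable pair spans a 15 × 15 window while using fewer weights than a single dense 3 × 3 layer, whereas a dense 15 × 15 layer would need 25 times as many weights as the 3 × 3 one.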
Pages: 230-238
Number of pages: 9