Deep relational self-Attention networks for scene graph generation

Cited by: 3
Authors
Li, Ping [1]
Yu, Zhou [1]
Zhan, Yibing [1, 2]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Comp Sci & Technol, Key Lab Complex Syst Modeling & Simulat, Hangzhou, Peoples R China
[2] JD Explore Acad, Beijing, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Scene graph generation; Image understanding; Deep neural networks;
DOI
10.1016/j.patrec.2021.12.013
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Scene graph generation (SGG) aims to simultaneously detect objects in an image and predict relations for these detected objects. SGG is challenging because it requires modeling the contextualized relationships among objects rather than only considering relationships between paired objects. Most existing approaches address this problem by using a CNN or RNN framework, which cannot explicitly and effectively model the dense interactions among objects. In this paper, we exploit the attention mechanism and introduce a relational self-attention (RSA) module to simultaneously model the object and relation contexts. By stacking such RSA modules in depth, we obtain a deep relational self-attention network (RSAN), which is able to characterize complex interactions, thus facilitating the understanding of object and relation semantics. Extensive experiments on the benchmark Visual Genome dataset demonstrate the effectiveness of RSAN. (c) 2021 Elsevier B.V. All rights reserved.
Pages: 200 - 206
Number of pages: 7
Related papers
50 in total
  • [1] Deep relational self-Attention networks for scene graph generation
    Li, Ping
    Yu, Zhou
    Zhan, Yibing
    Pattern Recognition Letters, 2022, 153 : 200 - 206
  • [2] Universal Graph Transformer Self-Attention Networks
    Dai Quoc Nguyen
    Tu Dinh Nguyen
    Dinh Phung
    COMPANION PROCEEDINGS OF THE WEB CONFERENCE 2022, WWW 2022 COMPANION, 2022, : 193 - 196
  • [3] On the Global Self-attention Mechanism for Graph Convolutional Networks
    Wang, Chen
    Deng, Chengyuan
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 8531 - 8538
  • [4] A unified deep sparse graph attention network for scene graph generation
    Zhou, Hao
    Yang, Yazhou
    Luo, Tingjin
    Zhang, Jun
    Li, Shuohao
    PATTERN RECOGNITION, 2022, 123
  • [5] Zero-shot Scene Graph Generation with Relational Graph Neural Networks
    Yu, Xiang
    Li, Jie
    Yuan, Shijing
    Wang, Chao
    Wu, Chentao
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 1894 - 1900
  • [6] Self-Attention Graph Pooling
    Lee, Junhyun
    Lee, Inyeop
    Kang, Jaewoo
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [7] Self-Attention Based Sequential Recommendation With Graph Convolutional Networks
    Seng, Dewen
    Wang, Jingchang
    Zhang, Xuefeng
    IEEE ACCESS, 2024, 12 : 32780 - 32787
  • [8] Transformer-based Scene Graph Generation Network With Relational Attention Module
    Yamamoto, Takuma
    Obinata, Yuya
    Nakayama, Osafumi
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 2034 - 2041
  • [9] Deep Generative Probabilistic Graph Neural Networks for Scene Graph Generation
    Khademi, Mahmoud
    Schulte, Oliver
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 11237 - 11245
  • [10] Lipschitz Normalization for Self-Attention Layers with Application to Graph Neural Networks
    Dasoulas, George
    Scaman, Kevin
    Virmaux, Aladin
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139