Learning to transfer focus of graph neural network for scene graph parsing

Cited by: 17
Authors
Jiang, Junjie [1]
He, Zaixing [1, 2]
Zhang, Shuyou [2]
Zhao, Xinyue [2]
Tan, Jianrong [2]
Affiliations
[1] Zhejiang Univ, State Key Lab Fluid Power & Mechatron Syst, Hangzhou 310027, Peoples R China
[2] Zhejiang Univ, State Key Lab CAD & CG, Hangzhou 310027, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Semantic relationship; Graphical focus; Scene graph; Class imbalance; Image understanding;
DOI
10.1016/j.patcog.2020.107707
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Scene graph parsing has become a new challenge in image understanding and pattern recognition in recent years. It captures objects and their relationships and provides a structured representation of the visual scene. Among the three types of high-level relationships in scene graphs, semantic relationships, which carry the global understanding of the scene, are the core and the most valuable, while geometric and possessive relationships carry only local and limited information. However, semantic relationships span many types with few instances of each, so existing detectors recognize most of them poorly. To address this issue, this paper proposes a new architecture, the graphical focal network, which uses a decision-level global detector to capture the dependencies between object and relationship local detectors. We construct a graphical focal loss, which compensates for the lack of semantic relationship instances by adjusting the proportion of relationship loss based on the degree of relationship rarity and learning difficulty, and improves the stability of key object recognition by adjusting the proportion of object loss based on the degree of node connectivity and the value of neighborhood relationships. The proposed relative depth encoding module and regional layout encoding module introduce, respectively, relative depth information and more effective geometric layout information between objects, further improving performance. Experiments on the Visual Genome benchmark show that our method outperforms the most advanced competitors in two types of performance metrics. For semantic types, the recognition rate of our method is 2.0 times that of the baseline. © 2020 Elsevier Ltd. All rights reserved.
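The abstract describes reweighting the relationship loss by rarity and learning difficulty. The sketch below is not the paper's graphical focal loss, whose formulation is not given in this record. It only illustrates the general idea, assuming the standard focal loss term (1 - p_t)^gamma for difficulty and a plain inverse-frequency weight for rarity. The function names `rarity_weights` and `focal_loss`, and the normalization used, are hypothetical choices for illustration.

```python
import math
from collections import Counter


def rarity_weights(labels, num_classes):
    """Inverse-frequency weight per class (an assumed rarity measure).

    Weights are normalized so that classes present in `labels` average to 1.
    Classes that never occur get weight 0.
    """
    counts = Counter(labels)
    raw = [1.0 / counts[c] if counts[c] else 0.0 for c in range(num_classes)]
    seen = [w for w in raw if w > 0]
    mean = sum(seen) / len(seen)
    return [w / mean for w in raw]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, label, class_weights, gamma=2.0):
    """Standard focal loss for one sample: -w * (1 - p_t)^gamma * log(p_t).

    p_t is the predicted probability of the true class. Confident, correct
    predictions (p_t near 1) contribute little, so hard examples dominate.
    """
    p_t = softmax(logits)[label]
    return -class_weights[label] * (1.0 - p_t) ** gamma * math.log(p_t)


# Toy label set: class 1 is three times rarer than class 0.
weights = rarity_weights([0, 0, 0, 1], num_classes=2)
easy = focal_loss([5.0, 0.0], 0, weights)  # confident and correct
hard = focal_loss([0.0, 5.0], 0, weights)  # confident and wrong
```

With `gamma=0` and unit weights the function reduces to ordinary cross-entropy, which makes the two adjustments easy to compare in isolation. The object-side terms mentioned in the abstract (node connectivity, neighborhood relationship value) are not modeled here.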
Pages: 14