Compositional Feature Augmentation for Unbiased Scene Graph Generation

Cited by: 11
Authors
Li, Lin [1, 2]
Chen, Guikun [1]
Xiao, Jun [1]
Yang, Yi [1]
Wang, Chunping [3]
Chen, Long [2]
Affiliations
[1] Zhejiang Univ, Hangzhou, Peoples R China
[2] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
[3] FinVolut, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1109/ICCV51070.2023.01982
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Scene Graph Generation (SGG) aims to detect all the visual relation triplets <sub, pred, obj> in a given image. With the emergence of various advanced techniques for better utilizing both the intrinsic and extrinsic information in each relation triplet, SGG has achieved great progress over the recent years. However, due to the ubiquitous long-tailed predicate distributions, today's SGG models are still easily biased to the head predicates. Currently, the most prevalent debiasing solutions for SGG are re-balancing methods, e.g., changing the distributions of original training samples. In this paper, we argue that all existing re-balancing strategies fail to increase the diversity of the relation triplet features of each predicate, which is critical for robust SGG. To this end, we propose a novel Compositional Feature Augmentation (CFA) strategy, which is the first unbiased SGG work to mitigate the bias issue from the perspective of increasing the diversity of triplet features. Specifically, we first decompose each relation triplet feature into two components: intrinsic feature and extrinsic feature, which correspond to the intrinsic characteristics and extrinsic contexts of a relation triplet, respectively. Then, we design two different feature augmentation modules to enrich the feature diversity of original relation triplets by replacing or mixing up either their intrinsic or extrinsic features from other samples. Due to its model-agnostic nature, CFA can be seamlessly incorporated into various SGG frameworks. Extensive ablations have shown that CFA achieves a new state-of-the-art performance on the trade-off between different metrics.
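The two augmentation operations named in the abstract (replacing, or mixing up, the intrinsic or extrinsic part of a triplet feature with that of another sample) can be illustrated with a minimal sketch. This is not the authors' implementation: the `TripletFeature` container, the flat-list feature representation, the choice of which part is replaced and which is mixed, the mixing weight `lam`, and the decision to keep the target's predicate label are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TripletFeature:
    """Hypothetical container for one relation triplet feature."""
    predicate: str
    intrinsic: List[float]  # assumed: features of the triplet's own characteristics
    extrinsic: List[float]  # assumed: features of the surrounding context


def replace_intrinsic(target: TripletFeature, donor: TripletFeature) -> TripletFeature:
    """Keep the target's label and extrinsic part; swap in the donor's intrinsic part."""
    return TripletFeature(target.predicate, list(donor.intrinsic), list(target.extrinsic))


def mix_extrinsic(target: TripletFeature, donor: TripletFeature, lam: float) -> TripletFeature:
    """Mixup-style convex combination of the two extrinsic parts (weight lam on the target)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(target.extrinsic) != len(donor.extrinsic):
        raise ValueError("extrinsic features must have the same length")
    mixed = [lam * t + (1.0 - lam) * d for t, d in zip(target.extrinsic, donor.extrinsic)]
    return TripletFeature(target.predicate, list(target.intrinsic), mixed)


# Toy usage: augment a sample of one predicate with parts of another sample.
tail = TripletFeature("riding", [1.0, 2.0], [0.0, 4.0])
other = TripletFeature("on", [3.0, 5.0], [2.0, 0.0])
swapped = replace_intrinsic(tail, other)
blended = mix_extrinsic(tail, other, lam=0.5)
```

Both functions return new objects and leave their inputs unchanged, so an augmented sample can be appended to a training batch alongside the original.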
Pages: 21628-21638
Number of pages: 11
Related papers
50 in total
  • [41] Weakly-supervised Video Scene Graph Generation via Unbiased Cross-modal Learning
    Wu, Ziyue
    Gao, Junyu
    Xu, Changsheng
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 4574 - 4583
  • [42] Taking a Closer Look At Visual Relation: Unbiased Video Scene Graph Generation With Decoupled Label Learning
    Wang, Wenqing
    Luo, Yawei
    Chen, Zhiqing
    Jiang, Tao
    Yang, Yi
    Xiao, Jun
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 5718 - 5728
  • [43] Refine and Redistribute: Multi-Domain Fusion and Dynamic Label Assignment for Unbiased Scene Graph Generation
    Zhang, Yujie
    Li, Yaochen
    Gao, Yuan
    Guo, Yimou
    Tang, Wenneng
    Li, Yanxue
    Atlaw, Meklit
    2024 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION, WACV 2024, 2024, : 1307 - 1316
  • [44] Unconditional Scene Graph Generation
    Garg, Sarthak
    Dhamo, Helisa
    Farshad, Azade
    Musatian, Sabrina
    Navab, Nassir
    Tombari, Federico
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 16342 - 16351
  • [45] Iterative Scene Graph Generation
    Khandelwal, Siddhesh
    Sigal, Leonid
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [46] Panoptic Scene Graph Generation
    Yang, Jingkang
    Ang, Yi Zhe
    Guo, Zujin
    Zhou, Kaiyang
    Zhang, Wayne
    Liu, Ziwei
    COMPUTER VISION - ECCV 2022, PT XXVII, 2022, 13687 : 178 - 196
  • [47] Unbiased scene graph generation via head-tail cooperative network with self-supervised learning
    Wang, Lei
    Yuan, Zejian
    Lu, Yao
    Chen, Badong
    IMAGE AND VISION COMPUTING, 2024, 151
  • [48] Beware of Overcorrection: Scene-induced Commonsense Graph for Scene Graph Generation
    Chen, Lianggangxu
    Lu, Jiale
    Song, Youqi
    Wang, Changbo
    He, Gaoqi
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 2888 - 2897
  • [49] Multimodal graph inference network for scene graph generation
    Jingwen Duan
    Weidong Min
    Deyu Lin
    Jianfeng Xu
    Xin Xiong
    Applied Intelligence, 2021, 51 : 8768 - 8783
  • [50] Multimodal graph inference network for scene graph generation
    Duan, Jingwen
    Min, Weidong
    Lin, Deyu
    Xu, Jianfeng
    Xiong, Xin
    APPLIED INTELLIGENCE, 2021, 51 (12) : 8768 - 8783