GroupRF: Panoptic Scene Graph Generation with group relation tokens

Cited by: 0
Authors
Wang, Hongyun [1, 2]
Li, Jiachen [1, 2]
Xiang, Xiang [3]
Xie, Qing [1, 2]
Ma, Yanchun [1, 2]
Liu, Yongjian [1, 2]
Affiliations
[1] Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan 430070, Hubei, Peoples R China
[2] Minist Educ, Engn Res Ctr Intelligent Serv Technol Digital Publ, Wuhan 430070, Hubei, Peoples R China
[3] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan 430074, Hubei, Peoples R China
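Funding
National Natural Science Foundation of China;
Keywords
Panoptic Scene Graph Generation; Multiple relation token; Fine-grained interaction;
DOI
10.1016/j.jvcir.2025.104405
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract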
Panoptic Scene Graph Generation (PSG) aims to predict a variety of relations between pairs of objects within an image, and to indicate the objects by panoptic segmentation masks instead of bounding boxes. Existing PSG methods attempt to straightforwardly fuse the object tokens for relation prediction, thus failing to fully utilize the interaction between the pairwise objects. To address this problem, we propose a novel framework named Group RelationFormer (GroupRF) to capture the fine-grained inter-dependency among all instances. Our method introduces a set of learnable tokens termed group rln tokens, which exploit fine-grained contextual interaction between object tokens with multiple attentive relations. In the process of relation prediction, we adopt multiple triplets to take advantage of the fine-grained interaction included in group rln tokens. We conduct comprehensive experiments on the OpenPSG dataset, which show that our method outperforms the previous state-of-the-art method. Furthermore, we also show the effectiveness of our framework by ablation studies. Our code is available at https://github.com/WHY-student/GroupRF.
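The abstract describes learnable tokens that attend to object tokens. As a rough illustration only, the sketch below shows generic scaled dot-product cross-attention in which a small set of randomly initialized "group" tokens gathers context from object tokens. It is not the authors' implementation (see their repository for that). The function names, token counts, and dimensions are all hypothetical, and no learned projections or training are included.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Each query token attends over all key/value tokens (no projections)."""
    d = len(queries[0])
    out = []
    for q in queries:
        # Scaled dot-product score between this query and every object token.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        # Weighted average of the object tokens.
        out.append([sum(w * v[j] for w, v in zip(weights, keys_values))
                    for j in range(d)])
    return out


# Hypothetical sizes chosen only for illustration.
DIM, NUM_OBJECTS, NUM_GROUP_TOKENS = 8, 5, 3
rng = random.Random(0)
object_tokens = [[rng.gauss(0, 1) for _ in range(DIM)]
                 for _ in range(NUM_OBJECTS)]
# Stand-in for learnable tokens: random values, not trained parameters.
group_tokens = [[rng.gauss(0, 1) for _ in range(DIM)]
                for _ in range(NUM_GROUP_TOKENS)]

updated = cross_attend(group_tokens, object_tokens)
print(len(updated), len(updated[0]))  # prints "3 8"
```

Each updated token is a convex combination of the object tokens, so it summarizes the objects it attends to most strongly. How GroupRF turns such tokens into multiple relation triplets is not specified in this record.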
Pages: 10
Related papers
50 in total
  • [31] Prediction and Generation of 3D Functional Scene Based on Relation Graph
    Sun Q.
    Hu R.
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2022, 34 (09): : 1351 - 1361
  • [32] Beware of Overcorrection: Scene-induced Commonsense Graph for Scene Graph Generation
    Chen, Lianggangxu
    Lu, Jiale
    Song, Youqi
    Wang, Changbo
    He, Gaoqi
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 2888 - 2897
  • [33] Multimodal graph inference network for scene graph generation
    Jingwen Duan
    Weidong Min
    Deyu Lin
    Jianfeng Xu
    Xin Xiong
    Applied Intelligence, 2021, 51 : 8768 - 8783
  • [34] Multimodal graph inference network for scene graph generation
    Duan, Jingwen
    Min, Weidong
    Lin, Deyu
    Xu, Jianfeng
    Xiong, Xin
    APPLIED INTELLIGENCE, 2021, 51 (12) : 8768 - 8783
  • [35] Graph R-CNN for Scene Graph Generation
    Yang, Jianwei
    Lu, Jiasen
    Lee, Stefan
    Batra, Dhruv
    Parikh, Devi
    COMPUTER VISION - ECCV 2018, PT I, 2018, 11205 : 690 - 706
  • [36] Knowledge-Enhanced Scene Graph Generation with Multimodal Relation Alignment (Student Abstract)
    Fu, Ze
    Feng, Junhao
    Zheng, Changmeng
    Cai, Yi
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 12947 - 12948
  • [37] Scene Graph Generation: A comprehensive survey
    Li, Hongsheng
    Zhu, Guangming
    Zhang, Liang
    Jiang, Youliang
    Dang, Yixuan
    Hou, Haoran
    Shen, Peiyi
    Zhao, Xia
    Shah, Syed Afaq Ali
    Bennamoun, Mohammed
    NEUROCOMPUTING, 2024, 566
  • [38] Unbiased Scene Graph Generation in Videos
    Nag, Sayak
    Min, Kyle
    Tripathi, Subarna
    Roy-Chowdhury, Amit K.
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 22803 - 22813
  • [39] Fully Convolutional Scene Graph Generation
    Liu, Hengyue
    Yan, Ning
    Mortazavi, Masood
    Bhanu, Bir
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 11541 - 11551
  • [40] Review on scene graph generation methods
    Monesh, S.
    Senthilkumar, N. C.
    MULTIAGENT AND GRID SYSTEMS, 2024, 20 (02) : 129 - 160