A Novel End-to-End Transformer for Scene Graph Generation

Authors
Ren, Chengkai [1]
Liu, Xiuhua [2]
Cao, Mengyuan [2]
Zhang, Jian [1]
Wang, Hongwei [1]
Affiliations
[1] Zhejiang Univ, ZJU UIUC Inst, Haining, Peoples R China
[2] Intelligent Sci & Technol Acad CASIC, Beijing, Peoples R China
Keywords
Transformer; Scene Graph; Scene Understanding; End-to-end; Visual Relationship Detection
DOI
10.1109/IJCNN54540.2023.10191798
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
An image usually contains not only visual information but also higher-level semantic information. Nevertheless, earlier computer vision algorithms, such as object detection and image classification, use only the visual features of the image. The recent surge of interest in scene graphs in computer vision has raised the challenge of generating structured scene graphs with rich semantic information. This paper proposes a one-stage, query-based, end-to-end Transformer model that generates scene graphs using the Hungarian matching algorithm. We develop an anti-bias reasoner module to reduce the impact of the unbalanced data distribution. We also propose a time-division training strategy that improves training efficiency and speeds up convergence while improving training performance. Experiments on the large-scale Visual Genome dataset were conducted to confirm the validity of our method. Compared with the existing state-of-the-art method, our method offers fast inference while maintaining acceptable performance, making it better suited to tasks with strict real-time requirements. Our work demonstrates that the one-stage method has great potential for exploration in scene graph generation.
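The abstract names the Hungarian matching algorithm but does not give the matching cost, so the following is only a minimal sketch of how query-based models commonly pair ground-truth items with predictions. The function name `hungarian_match` and the toy cost values are assumptions for illustration, not the authors' implementation; in practice each entry of the cost matrix would combine classification and box terms for one (ground-truth triplet, query prediction) pair.

```python
def hungarian_match(cost):
    """Minimum-cost one-to-one assignment of rows to columns.

    cost[i][j] is the (assumed) cost of matching ground-truth item i
    to query prediction j. Requires len(cost) <= len(cost[0]).
    Returns a sorted list of (row, column) pairs.
    """
    n, m = len(cost), len(cost[0])
    if n > m:
        raise ValueError("need at least as many columns as rows")
    inf = float("inf")
    u = [0] * (n + 1)    # row potentials (1-indexed)
    v = [0] * (m + 1)    # column potentials (1-indexed)
    p = [0] * (m + 1)    # p[j] = row currently matched to column j (0 = free)
    way = [0] * (m + 1)  # back-pointers along the augmenting path

    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        # Grow an alternating tree until a free column is reached.
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            # Shift potentials so at least one new edge becomes tight.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path.
        while True:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
            if j0 == 0:
                break

    return sorted((p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0)


# Toy example: 2 ground-truth triplets, 3 query predictions (made-up costs).
cost = [
    [9, 1, 5],
    [2, 8, 7],
]
print(hungarian_match(cost))  # [(0, 1), (1, 0)]
```

Here ground-truth item 0 is paired with query 1 and item 1 with query 0, for a total cost of 3; the unmatched query 2 would typically be supervised as "no relation". Where SciPy is available, `scipy.optimize.linear_sum_assignment` computes the same kind of assignment.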
Pages: 7