Meta Spatio-Temporal Debiasing for Video Scene Graph Generation

Cited by: 15
Authors
Xu, Li [1]
Qu, Haoxuan [1]
Kuen, Jason [2]
Gu, Jiuxiang [2]
Liu, Jun [1]
Affiliations
[1] Singapore University of Technology and Design, Singapore, Singapore
[2] Adobe Research, San Jose, CA, USA
Source
Funding
National Research Foundation, Singapore;
Keywords
VidSGG; Long-tailed bias; Meta learning;
DOI
10.1007/978-3-031-19812-0_22
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405;
Abstract
Video scene graph generation (VidSGG) aims to parse the video content into scene graphs, which involves modeling the spatio-temporal contextual information in the video. However, due to the long-tailed training data in datasets, the generalization performance of existing VidSGG models can be affected by the spatio-temporal conditional bias problem. In this work, from the perspective of meta-learning, we propose a novel Meta Video Scene Graph Generation (MVSGG) framework to address such a bias problem. Specifically, to handle various types of spatio-temporal conditional biases, our framework first constructs a support set and a group of query sets from the training data, where the data distribution of each query set is different from that of the support set w.r.t. a type of conditional bias. Then, by performing a novel meta training and testing process to optimize the model to obtain good testing performance on these query sets after training on the support set, our framework can effectively guide the model to learn to well generalize against biases. Extensive experiments demonstrate the efficacy of our proposed framework.
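The meta training and testing idea in the abstract can be illustrated with a toy sketch. This is not the MVSGG implementation: it assumes a scalar linear model `y = w * x`, a mean squared error loss, and a generic first-order meta-learning update, and all names (`mse_grad`, `meta_step`, `inner_lr`, `outer_lr`) and data values are hypothetical. It only shows the general pattern of training on a support set and then testing on query sets whose data distribution differs from it.

```python
def mse_grad(w, data):
    """Gradient of the mean squared error of y = w * x with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in data) / len(data)


def meta_step(w, support, query_sets, inner_lr=0.1, outer_lr=0.1):
    """One generic first-order meta-learning step (illustrative assumption)."""
    # Meta training: take one virtual gradient step on the support set.
    g_support = mse_grad(w, support)
    w_fast = w - inner_lr * g_support
    # Meta testing: measure how the virtually updated weight behaves on each
    # query set, whose inputs are distributed differently from the support set.
    g_query = sum(mse_grad(w_fast, q) for q in query_sets) / len(query_sets)
    # First-order approximation: the query gradient taken at w_fast is applied
    # directly to w, ignoring second-order terms.
    return w - outer_lr * (g_support + g_query)


# Toy data generated by y = 2 * x; the query sets use different x ranges.
support = [(1.0, 2.0), (2.0, 4.0)]
query_sets = [[(3.0, 6.0)], [(0.5, 1.0)]]

w = 0.0
for _ in range(20):
    w = meta_step(w, support, query_sets)
print(round(w, 6))
```

In this sketch the update direction combines the support-set gradient with the query-set gradients, so the weight is pushed toward values that still perform well after the distribution shift between support and query data.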
Pages: 374-390
Page count: 17