Meta Spatio-Temporal Debiasing for Video Scene Graph Generation

Cited by: 15
Authors
Xu, Li [1 ]
Qu, Haoxuan [1 ]
Kuen, Jason [2 ]
Gu, Jiuxiang [2 ]
Liu, Jun [1 ]
Affiliations
[1] Singapore Univ Technol & Design, Singapore, Singapore
[2] Adobe Res, San Jose, CA USA
Source
Computer Vision - ECCV 2022 (Lecture Notes in Computer Science), Springer
Funding
National Research Foundation, Singapore
Keywords
VidSGG; Long-tailed bias; Meta learning;
DOI
10.1007/978-3-031-19812-0_22
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Video scene graph generation (VidSGG) aims to parse video content into scene graphs, which involves modeling the spatio-temporal contextual information in the video. However, because the training data in existing datasets are long-tailed, the generalization performance of existing VidSGG models can suffer from the spatio-temporal conditional bias problem. In this work, we propose a novel Meta Video Scene Graph Generation (MVSGG) framework that addresses this bias problem from a meta-learning perspective. Specifically, to handle various types of spatio-temporal conditional bias, our framework first constructs a support set and a group of query sets from the training data, where the data distribution of each query set differs from that of the support set with respect to one type of conditional bias. Then, through a novel meta training and testing process that optimizes the model to achieve good testing performance on these query sets after training on the support set, our framework effectively guides the model to generalize well against such biases. Extensive experiments demonstrate the efficacy of the proposed framework.
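The following is a minimal, first-order sketch (in Python/PyTorch) of the meta training-and-testing step the abstract describes: a clone of the model is first adapted on the support set, and the adapted model is then required to perform well on query sets whose distributions differ from the support set with respect to one type of spatio-temporal conditional bias each. All identifiers (model, sgg_loss, support_set, query_sets, meta_optimizer) are illustrative assumptions, and the first-order gradient transfer is a simplification; this is not the authors' released implementation.

import copy
import torch


def meta_debias_step(model, sgg_loss, support_set, query_sets,
                     meta_optimizer, inner_lr=1e-3):
    # "Meta training": adapt a clone of the model with one SGD step on the support set.
    adapted = copy.deepcopy(model)
    inner_opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)
    support_loss = sum(sgg_loss(adapted(x), y) for x, y in support_set)
    inner_opt.zero_grad()
    support_loss.backward()
    inner_opt.step()

    # "Meta testing": the adapted model should also perform well on every
    # bias-controlled query set; their summed loss forms the meta objective.
    query_loss = sum(sgg_loss(adapted(x), y)
                     for query_set in query_sets for x, y in query_set)

    # First-order update: move the query-loss gradients computed on the adapted
    # clone back onto the original parameters (a FOMAML-style simplification).
    grads = torch.autograd.grad(query_loss, list(adapted.parameters()))
    meta_optimizer.zero_grad()
    for param, grad in zip(model.parameters(), grads):
        param.grad = grad.detach().clone()
    meta_optimizer.step()

    return support_loss.item(), query_loss.item()

As a usage note, this step would be called repeatedly with a VidSGG model (an nn.Module), a relation-classification loss, and support/query mini-batches sampled as described in the abstract; the actual framework may differ in how the inner adaptation and meta objective are optimized.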
Pages: 374-390
Number of pages: 17
Related Papers
50 records in total
  • [1] Video-based spatio-temporal scene graph generation with efficient self-supervision tasks
    Chen, Lianggangxu
    Cai, Yiqing
    Lu, Changhong
    Wang, Changbo
    He, Gaoqi
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (25) : 38947 - 38966
  • [2] Constructing Holistic Spatio-Temporal Scene Graph for Video Semantic Role Labeling
    Zhao, Yu
    Fei, Hao
    Cao, Yixin
    Li, Bobo
    Zhang, Meishan
    Wei, Jianguo
    Zhang, Min
    Chua, Tat-Seng
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 5281 - 5291
  • [3] Video Relation Detection with Spatio-Temporal Graph
    Qian, Xufeng
    Zhuang, Yueting
    Li, Yimeng
    Xiao, Shaoning
    Pu, Shiliang
    Xiao, Jun
    PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 84 - 93
  • [4] Exploring the Spatio-Temporal Aware Graph for video captioning
    Xue, Ping
    Zhou, Bing
    IET COMPUTER VISION, 2022, 16 (05) : 456 - 467
  • [5] Informative Scene Graph Generation via Debiasing
    Gao, Lianli
    Lyu, Xinyu
    Guo, Yuyu
    Hu, Yuxuan
    Li, Yuan-Fang
    Xu, Lu
    Shen, Heng Tao
    Song, Jingkuan
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2025,
  • [6] Random Generation of a Locally Consistent Spatio-Temporal Graph
    Leborgne, Aurelie
    Kirandjiska, Marija
    Le Ber, Florence
    GRAPH-BASED REPRESENTATION AND REASONING (ICCS 2021), 2021, 12879 : 155 - 169
  • [7] Video Synopsis Generation Using Spatio-Temporal Groups
    Ahmed, A.
    Kar, S.
    Dogra, D. P.
    Patnaik, R.
    Lee, S.
    Choi, H.
    Kim, I.
    2017 IEEE INTERNATIONAL CONFERENCE ON SIGNAL AND IMAGE PROCESSING APPLICATIONS (ICSIPA), 2017, : 512 - 517
  • [8] Video Generation for High Spatio-temporal Resolution Imaging
    Imagawa, T.
    Azuma, T.
    Nobori, K.
    Motomura, H.
    2009 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS, 2009, : 151 - 152
  • [9] ImaGINator: Conditional Spatio-Temporal GAN for Video Generation
    Wang, Yaohui
    Bilinski, Piotr
    Bremond, Francois
    Dantcheva, Antitza
    2020 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2020, : 1149 - 1158