Towards Global Video Scene Segmentation with Context-Aware Transformer

Cited by: 0
Authors
Yang, Yang [1 ,2 ,3 ]
Huang, Yurui [1 ]
Guo, Weili [1 ]
Xu, Baohua [4 ]
Xia, Dingyin
Affiliations
[1] Nanjing Univ Sci & Technol, Nanjing, Peoples R China
[2] NUAA, MIIT Key Lab Pattern Anal & Machine Intelligence, Nanjing, Peoples R China
[3] NJU, State Key Lab Novel Software Technol, Nanjing, Peoples R China
[4] HUAWEI CBG Edu Lab, Montreal, PQ, Canada
Source
THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 3 | 2023
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
For videos such as movies or TV episodes, the long storyline usually needs to be divided into cohesive units, i.e., scenes, to facilitate the understanding of video semantics. The key challenge lies in finding the scene boundaries by comprehensively considering the complex temporal structure and semantic information. To this end, we introduce a novel Context-Aware Transformer (CAT) with a self-supervised learning framework to learn high-quality shot representations for generating well-bounded scenes. More specifically, we design the CAT with local-global self-attentions, which effectively consider both the long-term and short-term context to improve the shot encoding. To train the CAT, we adopt a self-supervised learning scheme. First, we leverage shot-to-scene level pretext tasks to facilitate pre-training with pseudo boundaries, which guides CAT to learn discriminative shot representations that maximize intra-scene similarity and inter-scene discrimination in an unsupervised manner. Then, we transfer the contextual representations for fine-tuning the CAT with supervised data, which encourages CAT to accurately detect the boundaries for scene segmentation. As a result, CAT is able to learn context-aware shot representations and provides global guidance for scene segmentation. Our empirical analyses show that CAT achieves state-of-the-art performance on the scene segmentation task on the MovieNet dataset, e.g., offering an improvement of 2.15 in AP.
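As a rough illustration of the local-global self-attention idea described in the abstract, the following minimal sketch encodes each shot once with a short-range (windowed) attention and once with full-sequence attention, then averages the two. It is not the authors' architecture: the function names (`attend`, `local_global_encode`), the `window` parameter, the use of identity query/key/value projections, and the averaging step are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(shots, window=None):
    """Scaled dot-product self-attention over shot vectors.

    Assumption: queries, keys and values are the raw shot vectors
    (no learned projections). window=None lets every shot attend to
    all shots (global); window=w restricts shot i to shots j with
    |i - j| <= w (local).
    """
    n, d = len(shots), len(shots[0])
    out = []
    for i in range(n):
        keys = [j for j in range(n) if window is None or abs(i - j) <= window]
        scores = [
            sum(a * b for a, b in zip(shots[i], shots[j])) / math.sqrt(d)
            for j in keys
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * shots[j][k] for w, j in zip(weights, keys)) for k in range(d)]
        )
    return out


def local_global_encode(shots, window=1):
    """Average a local (short-term) and a global (long-term) encoding.

    Averaging is a hypothetical fusion choice for this sketch only.
    """
    local = attend(shots, window)
    global_ = attend(shots)
    return [
        [(l + g) / 2 for l, g in zip(l_row, g_row)]
        for l_row, g_row in zip(local, global_)
    ]


# Toy input: four 2-D "shot embeddings", two per hypothetical scene.
shots = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
encoded = local_global_encode(shots, window=1)
```

With `window=0` each shot attends only to itself, and with a window covering the whole sequence the local branch coincides with the global one, which makes the role of the window size easy to check on toy inputs.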
Pages: 3206 - 3213
Number of pages: 8