BaSSL: Boundary-aware Self-Supervised Learning for Video Scene Segmentation

Cited by: 0

Authors:

Mun, Jonghwan [1]

Shin, Minchul [1]

Han, Gunsoo [1]

Lee, Sangho [2]

Ha, Seongsu [2]

Lee, Joonseok [2]

Kim, Eun-Sol [3]

Affiliations:

[1] Kakao Brain, Seongnam, South Korea

[2] Seoul Natl Univ, Grad Sch Data Sci, Seoul, South Korea

[3] Hanyang Univ, Dept Comp Sci, Seoul, South Korea

Source:

COMPUTER VISION - ACCV 2022, PT IV | 2023 / Vol. 13844

Keywords:

Video scene segmentation; Self-supervised learning;

DOI:

10.1007/978-3-031-26316-3_29

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Self-supervised learning has drawn attention through its effectiveness in learning in-domain representations with no ground-truth annotations; in particular, it has been shown that properly designed pretext tasks bring significant performance gains for downstream tasks. Inspired by this, we tackle video scene segmentation, the task of temporally localizing scene boundaries in a long video, with a self-supervised learning framework in which we mainly focus on designing effective pretext tasks. In our framework, given a long video, we adopt a sliding window scheme; from a sequence of shots in each window, we discover the moment with the maximum semantic transition and leverage it as a pseudo-boundary to facilitate pre-training. Specifically, we introduce three novel boundary-aware pretext tasks: 1) Shot-Scene Matching (SSM), 2) Contextual Group Matching (CGM) and 3) Pseudo-boundary Prediction (PP); SSM and CGM guide the model to maximize intra-scene similarity and inter-scene discrimination by capturing contextual relations between shots, while PP encourages the model to identify transitional moments. We perform an extensive analysis to validate the effectiveness of our method and achieve a new state of the art on the MovieNet-SSeg benchmark. The code is available at https://github.com/kakaobrain/bassl.
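
The sliding-window pseudo-boundary idea in the abstract can be illustrated with a small sketch. The abstract does not specify how the maximum semantic transition is found, so the rule used here (split each window where the mean embedding of the left group is least cosine-similar to the mean embedding of the right group) is an assumption chosen for simplicity, not the authors' procedure. The function names `find_pseudo_boundary` and `sliding_pseudo_boundaries` are hypothetical, and shot embeddings are assumed to be plain lists of floats; see the linked repository for the actual implementation.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def _mean(vectors: Sequence[Vector]) -> List[float]:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def _cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_pseudo_boundary(window: Sequence[Vector]) -> int:
    """Return k such that shots [0..k] and [k+1..] are least similar.

    Assumption (not from the paper): the 'semantic transition' is scored
    by cosine similarity between the mean embeddings of the two groups.
    """
    if len(window) < 2:
        raise ValueError("a window needs at least two shots")
    best_k, best_sim = 0, float("inf")
    for k in range(len(window) - 1):
        sim = _cosine(_mean(window[: k + 1]), _mean(window[k + 1:]))
        if sim < best_sim:
            best_k, best_sim = k, sim
    return best_k


def sliding_pseudo_boundaries(
    shots: Sequence[Vector], window_size: int, stride: int = 1
) -> List[Tuple[int, int]]:
    """Slide a window over the shots; return (window_start, boundary_index).

    boundary_index is the absolute index of the last shot before the
    pseudo-boundary within that window.
    """
    results = []
    for start in range(0, len(shots) - window_size + 1, stride):
        k = find_pseudo_boundary(shots[start:start + window_size])
        results.append((start, start + k))
    return results


# Toy example: three shots pointing along one axis, then three along another.
toy_shots = [[1.0, 0.0], [1.0, 0.1], [0.9, 0.0],
             [0.0, 1.0], [0.1, 1.0], [0.0, 0.9]]
toy_boundaries = sliding_pseudo_boundaries(toy_shots, window_size=4)
```

In this toy input every window straddles the change between the third and fourth shots, so each window reports the same absolute boundary index. A pre-training pipeline along the lines of the abstract would then use such pseudo-boundaries as supervision targets for the pretext tasks.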

Pages: 485-501

Page count: 17
