Global Spectral Filter Memory Network for Video Object Segmentation

Cited by: 13
Authors
Liu, Yong [1, 2]
Yu, Ran [1]
Wang, Jiahao [1]
Zhao, Xinyuan [3]
Wang, Yitong [2]
Tang, Yansong [1]
Yang, Yujiu [1]
Affiliations
[1] Tsinghua Univ, Tsinghua Shenzhen Int Grad Sch, Beijing, Peoples R China
[2] ByteDance Inc, Beijing, Peoples R China
[3] Northwestern Univ, Evanston, IL USA
Source
Funding
National Natural Science Foundation of China
Keywords
Video object segmentation; Spectral domain
DOI
10.1007/978-3-031-19818-2_37
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper studies semi-supervised video object segmentation by boosting intra-frame interaction. Recent memory network-based methods focus on exploiting inter-frame temporal reference while paying little attention to intra-frame spatial dependency. Specifically, these segmentation models tend to be susceptible to interference from unrelated non-target objects within a frame. To this end, we propose the Global Spectral Filter Memory network (GSFM), which improves intra-frame interaction by learning long-term spatial dependencies in the spectral domain. The key component of GSFM is the 2D (inverse) discrete Fourier transform for spatial information mixing. In addition, we empirically find that low-frequency features should be enhanced in the encoder (backbone) and high-frequency features in the decoder (segmentation head). We attribute this to the encoder's role of extracting semantic information and the decoder's role of highlighting fine-grained details. Thus, a Low (High) Frequency Module is proposed for the encoder (decoder). Extensive experiments on the popular DAVIS and YouTube-VOS benchmarks demonstrate that GSFM noticeably outperforms the baseline method and achieves state-of-the-art performance. Further analysis shows that the proposed modules are well-founded and generalize well.
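The spatial mixing step described in the abstract (2D DFT, a filter applied per frequency, then the inverse DFT) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `spectral_filter` and `radial_mask`, the tensor layout, and the fixed radial cutoff are assumptions made for illustration, and the record does not say how GSFM parameterizes or learns its filters. The fixed low-pass and high-pass masks only stand in for the idea of emphasizing low or high frequencies.

```python
import numpy as np


def spectral_filter(feat, weight):
    """Mix spatial information globally: 2D DFT -> per-frequency scaling -> inverse DFT.

    feat:   real array of shape (C, H, W), a hypothetical feature map.
    weight: array broadcastable to (C, H, W // 2 + 1); in a trained model this
            would be a learned filter, here it is supplied by the caller.
    """
    spec = np.fft.rfft2(feat, axes=(-2, -1), norm="ortho")
    spec = spec * weight  # every output pixel now depends on every input pixel
    return np.fft.irfft2(spec, s=feat.shape[-2:], axes=(-2, -1), norm="ortho")


def radial_mask(h, w, cutoff, keep="low"):
    """Binary mask over the rfft2 frequency grid, split by radial frequency.

    cutoff is in cycles per pixel (assumed parameter). keep="low" keeps
    frequencies at or below the cutoff, keep="high" keeps the rest.
    """
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies, shape (H, 1)
    fx = np.fft.rfftfreq(w)[None, :]  # horizontal frequencies, shape (1, W//2+1)
    low = np.sqrt(fy ** 2 + fx ** 2) <= cutoff
    return low if keep == "low" else ~low


# Usage: split a random feature map into low- and high-frequency parts.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))
x_low = spectral_filter(x, radial_mask(8, 8, 0.2, "low"))
x_high = spectral_filter(x, radial_mask(8, 8, 0.2, "high"))
```

Because the two masks are complementary and the transform is linear, `x_low + x_high` reconstructs `x`, which makes the sketch easy to sanity-check.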
Pages: 648-665
Page count: 18
Related papers
50 in total
  • [1] Modulated Memory Network for Video Object Segmentation
    Lu, Hannan
    Guo, Zixian
    Zuo, Wangmeng
    MATHEMATICS, 2024, 12 (06)
  • [2] Efficient Regional Memory Network for Video Object Segmentation
    Xie, Haozhe
    Yao, Hongxun
    Zhou, Shangchen
    Zhang, Shengping
    Sun, Wenxiu
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 1286 - 1295
  • [3] Robust and Efficient Memory Network for Video Object Segmentation
    Chen, Yadang
    Zhang, Dingwei
    Yang, Zhi-Xin
    Wu, Enhua
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 1769 - 1774
  • [4] Hierarchical Memory Matching Network for Video Object Segmentation
    Seong, Hongje
    Oh, Seoung Wug
    Lee, Joon-Young
    Lee, Seongwon
    Lee, Suhyeon
    Kim, Euntai
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 12869 - 12878
  • [5] Unsupervised Video Object Segmentation via Prototype Memory Network
    Yonsei University, Korea, Republic of
    Unknown
    Proc. - IEEE Winter Conf. Appl. Comput. Vis., WACV, 1600, (5913-5923):
  • [6] Dual Temporal Memory Network for Efficient Video Object Segmentation
    Zhang, Kaihua
    Wang, Long
    Liu, Dong
    Liu, Bo
    Liu, Qingshan
    Li, Zhu
    MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 1515 - 1523
  • [7] Unsupervised Video Object Segmentation via Prototype Memory Network
    Lee, Minhyeok
    Cho, Suhwan
    Lee, Seunghoon
    Park, Chaewon
    Lee, Sangyoun
    arXiv, 2022,
  • [8] Unsupervised Video Object Segmentation via Prototype Memory Network
    Lee, Minhyeok
    Cho, Suhwan
    Lee, Seunghoon
    Park, Chaewon
    Lee, Sangyoun
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 5913 - 5923
  • [9] Video Object Segmentation Using Kernelized Memory Network With Multiple Kernels
    Seong, Hongje
    Hyun, Junhyuk
    Kim, Euntai
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (02) : 2595 - 2612
  • [10] Boosting Video Object Segmentation via Robust and Efficient Memory Network
    Chen, Yadang
    Zhang, Dingwei
    Zheng, Yuhui
    Yang, Zhi-Xin
    Wu, Enhua
    Zhao, Haixing
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (05) : 3340 - 3352