Fast pixel-matching for video object segmentation

Cited by: 8
Authors
Yu, Siyue [1 ]
Xiao, Jimin [1 ]
Zhang, Bingfeng [1 ]
Lim, Eng Gee [1 ]
Zhao, Yao [2 ]
Affiliations
[1] Xian Jiaotong Liverpool Univ, Suzhou, Jiangsu, Peoples R China
[2] Beijing Jiaotong Univ, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Non-local pixel matching; Mask-propagation; Encoder-decoder;
DOI
10.1016/j.image.2021.116373
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic and communication technology]
Subject classification codes
0808; 0809
Abstract
Video object segmentation, which aims to segment foreground objects given the annotation of the first frame, has been attracting increasing attention. Many state-of-the-art approaches achieve strong performance by relying on online model updating or mask-propagation techniques. However, most online models incur a high computational cost because the model is fine-tuned during inference. Most mask-propagation-based models are faster but perform relatively poorly because they fail to adapt to variations in object appearance. In this paper, we aim to design a new model that strikes a good balance between speed and performance. We propose a model, called NPMCA-net, which directly localizes foreground objects based on mask propagation and a non-local technique by matching pixels in the reference and target frames. Since we bring in information from both the first and previous frames, our network is robust to large variations in object appearance and can better adapt to occlusions. Extensive experiments show that our approach achieves new state-of-the-art performance at a fast speed (86.5% IoU on DAVIS-2016 and 72.2% IoU on DAVIS-2017, at 0.11 s per frame) under same-level comparison. Source code is available at https://github.com/siyueyu/NPMCA-net.
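The core idea in the abstract, matching every target-frame pixel against reference-frame pixels and carrying the reference mask across, can be illustrated with a small NumPy sketch. This is not the NPMCA-net implementation: the function name `propagate_mask`, the scaled dot-product similarity, the softmax weighting, and the hand-made two-channel features are all assumptions chosen for illustration. The actual network learns its features with an encoder and refines the result with a decoder.

```python
import numpy as np

def propagate_mask(references, target_feat):
    """Hypothetical non-local mask propagation (illustrative only).

    references:  list of (feat, mask) pairs, feat is (C, H, W), mask is (H, W)
    target_feat: (C, H, W) features of the frame to segment
    """
    c, h, w = target_feat.shape
    # Flatten every reference frame into per-pixel rows and stack them,
    # so the first and previous frames are matched against jointly.
    ref_pixels = np.concatenate([f.reshape(c, -1).T for f, _ in references], axis=0)
    ref_labels = np.concatenate([m.reshape(-1) for _, m in references], axis=0)
    tgt_pixels = target_feat.reshape(c, -1).T  # (H*W, C)

    # Assumed similarity: scaled dot product between all pixel pairs.
    logits = tgt_pixels @ ref_pixels.T / np.sqrt(c)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over reference pixels

    # Each target pixel receives the similarity-weighted average of reference labels.
    return (weights @ ref_labels).reshape(h, w)

# Toy features: foreground and background pixels get orthogonal vectors.
fg = np.array([10.0, 0.0])
bg = np.array([0.0, 10.0])

def make_frame(mask):
    """Build a (2, H, W) feature map from a binary mask (toy stand-in for an encoder)."""
    return np.where(mask[None] > 0, fg[:, None, None], bg[:, None, None])

first_mask = np.array([[1.0, 0.0], [0.0, 0.0]])   # object in the first frame
prev_mask = np.array([[0.0, 1.0], [0.0, 0.0]])    # object in the previous frame
target_true = np.array([[0.0, 0.0], [0.0, 1.0]])  # object position in the target frame

pred = propagate_mask(
    [(make_frame(first_mask), first_mask), (make_frame(prev_mask), prev_mask)],
    make_frame(target_true),
)
```

In this toy setup the object moves to a new position in every frame, yet `pred` closely follows `target_true`, because matching is done by feature similarity across all pixel pairs rather than by spatial position.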
Pages: 10
Related papers
50 in total
  • [1] PRM: A Pixel-Region-Matching Approach for Fast Video Object Segmentation
    Wang, Zhen
    Zhang, Wenbin
    Li, Jiahao
    Zhu, Wencheng
    Song, Danqing
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT IV, 2025, 15034 : 76 - 91
  • [2] Pixel-Level Bijective Matching for Video Object Segmentation
    Cho, Suhwan
    Lee, Heansung
    Kim, Minjung
    Jang, Sungjun
    Lee, Sangyoun
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 1453 - 1462
  • [3] Blazingly Fast Video Object Segmentation with Pixel-Wise Metric Learning
    Chen, Yuhua
    Pont-Tuset, Jordi
    Montes, Alberto
    Van Gool, Luc
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 1189 - 1198
  • [4] Pixel-Level Matching for Video Object Segmentation using Convolutional Neural Networks
    Yoon, Jae Shin
    Rameau, Francois
    Kim, Junsik
    Lee, Seokju
    Shin, Seunghak
    Kweon, In So
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 2186 - 2195
  • [5] Texture as pixel feature for video object segmentation
    Ahmed, R.
    Karmakar, G. C.
    Dooley, L. S.
    ELECTRONICS LETTERS, 2008, 44 (19) : 1126 - U12
  • [6] Fast Video Object Segmentation with Temporal Aggregation Network and Dynamic Template Matching
    Huang, Xuhua
    Xu, Jiarui
    Tai, Yu-Wing
    Tang, Chi-Keung
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020, : 8876 - 8886
  • [7] Fast object segmentation in unconstrained video
    Papazoglou, Anestis
    Ferrari, Vittorio
    2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2013, : 1777 - 1784
  • [8] VideoMatch: Matching Based Video Object Segmentation
    Hu, Yuan-Ting
    Huang, Jia-Bin
    Schwing, Alexander G.
    COMPUTER VISION - ECCV 2018, PT VIII, 2018, 11212 : 56 - 73
  • [9] Prototypical Matching Networks for Video Object Segmentation
    Lin, Fanchao
    Qiu, Zhaofan
    Liu, Chuanbin
    Yao, Ting
    Xie, Hongtao
    Zhang, Yongdong
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 5623 - 5636
  • [10] A Fast and Automatic Video Object Segmentation Technique
    Guo Lihua
    2008 INTERNATIONAL CONFERENCE ON COMMUNICATIONS, CIRCUITS AND SYSTEMS PROCEEDINGS, VOLS 1 AND 2: VOL 1: COMMUNICATION THEORY AND SYSTEM - VOL 2: SIGNAL PROCESSING, COMPUTATIONAL INTELLIGENCE, CIRCUITS AND SYSTEMS, 2008, : 806 - 809