Mask-Free Video Instance Segmentation

Cited by: 9
Authors
Ke, Lei [1, 2]
Danelljan, Martin [1]
Ding, Henghui [1]
Tai, Yu-Wing [2]
Tang, Chi-Keung [2]
Yu, Fisher [1]
Affiliations
[1] Swiss Fed Inst Technol, Zurich, Switzerland
[2] HKUST, Hong Kong, Peoples R China
DOI
10.1109/CVPR52729.2023.02189
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The recent advancement in Video Instance Segmentation (VIS) has largely been driven by the use of deeper and increasingly data-hungry transformer-based models. However, video masks are tedious and expensive to annotate, limiting the scale and diversity of existing VIS datasets. In this work, we aim to remove the mask-annotation requirement. We propose MaskFreeVIS, achieving highly competitive VIS performance, while only using bounding box annotations for the object state. We leverage the rich temporal mask consistency constraints in videos by introducing the Temporal KNN-patch Loss (TK-Loss), providing strong mask supervision without any labels. Our TK-Loss finds one-to-many matches across frames, through an efficient patch-matching step followed by a K-nearest neighbor selection. A consistency loss is then enforced on the found matches. Our mask-free objective is simple to implement, has no trainable parameters, is computationally efficient, yet outperforms baselines employing, e.g., state-of-the-art optical flow to enforce temporal mask consistency. We validate MaskFreeVIS on the YouTube-VIS 2019/2021, OVIS and BDD100K MOTS benchmarks. The results clearly demonstrate the efficacy of our method by drastically narrowing the gap between fully and weakly-supervised VIS performance. Our code and trained models are available at http://vis.xyz/pub/maskfreevis.
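The two-step idea in the abstract (patch matching across frames, K-nearest-neighbor selection, then a consistency term on the selected matches) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function name `temporal_knn_patch_loss`, the grayscale-frame simplification, the parameters `patch`, `radius`, `k` and `max_dist`, and the exact form of the consistency term (negative log of the probability that two matched pixels share the same foreground/background label) are all assumptions made for illustration.

```python
import numpy as np

def temporal_knn_patch_loss(frame_a, frame_b, mask_a, mask_b,
                            patch=3, radius=2, k=2, max_dist=0.05, eps=1e-6):
    """Illustrative sketch of a temporal KNN-patch consistency loss.

    frame_a, frame_b: (H, W) grayscale frames with values in [0, 1] (assumption).
    mask_a, mask_b:   (H, W) predicted foreground probabilities in [0, 1].
    All parameter names and defaults are hypothetical.
    """
    h, w = frame_a.shape
    half = patch // 2
    pad = half + radius
    # Edge-pad so every patch and every displaced patch stays in bounds.
    fa = np.pad(frame_a, pad, mode="edge")
    fb = np.pad(frame_b, pad, mode="edge")
    mb = np.pad(mask_b, radius, mode="edge")

    dists, agreements = [], []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Patch matching: mean squared difference between the patch at p
            # in frame A and the patch at p + (dy, dx) in frame B.
            d = np.zeros((h, w))
            for py in range(-half, half + 1):
                for px in range(-half, half + 1):
                    a = fa[pad + py: pad + py + h, pad + px: pad + px + w]
                    b = fb[pad + dy + py: pad + dy + py + h,
                           pad + dx + px: pad + dx + px + w]
                    d += (a - b) ** 2
            dists.append(d / (patch * patch))
            # Probability that p and its candidate share the same label.
            m_b = mb[radius + dy: radius + dy + h, radius + dx: radius + dx + w]
            agreements.append(mask_a * m_b + (1 - mask_a) * (1 - m_b))

    dists = np.stack(dists)            # (candidates, H, W)
    agreements = np.stack(agreements)  # (candidates, H, W)

    # K-nearest-neighbor selection: keep the k best candidates per pixel,
    # giving one-to-many matches, and drop those above the distance threshold.
    idx = np.argsort(dists, axis=0)[:k]
    top_d = np.take_along_axis(dists, idx, axis=0)
    top_agree = np.take_along_axis(agreements, idx, axis=0)
    valid = top_d < max_dist

    # Consistency term averaged over the valid matches.
    loss = -np.log(np.clip(top_agree, eps, 1.0))
    return float((loss * valid).sum() / max(valid.sum(), 1))
```

The sketch needs no mask labels and has no trainable parameters: the only inputs are two frames and the model's own mask predictions for them. Under these assumptions, predictions that move with the image content incur a low loss, while predictions that flip labels between matched pixels are penalized.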
Pages: 22857-22866
Page count: 10