Coarse Mask Guided Interactive Object Segmentation

Cited by: 3
Authors
Li, Jing [1, 2]
Fan, Junsong [3, 4]
Wang, Yuxi [3, 4]
Yang, Yuran [5]
Zhang, Zhaoxiang [4, 6, 7, 8]
Affiliations
[1] Chinese Acad Sci CASIA, Inst Automat, Ctr Res Intelligent Percept & Comp CRIPAC, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci UCAS, Sch Artificial Intelligence, Beijing 100190, Peoples R China
[3] Chinese Acad Sci CASIA, Inst Automat, Ctr Res Intelligent Percept & Comp CRIPAC, Beijing 100190, Peoples R China
[4] HKISI CAS, Ctr Artificial Intelligence & Robot, Hong Kong, Peoples R China
[5] Tencent Maps, Beijing 100101, Peoples R China
[6] Chinese Acad Sci CASIA, Inst Automat, Beijing 100190, Peoples R China
[7] Univ Chinese Acad Sci UCAS, Sch Future Technol, Beijing 100049, Peoples R China
[8] State Key Lab Multimodal Artificial Intelligence S, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Segmentation; interactive; transformer; annotation tool; RANDOM-WALKS; IMAGE; CUT;
DOI
10.1109/TIP.2023.3322564
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Interactive object segmentation aims to produce object masks from user interactions such as clicks, bounding boxes, and scribbles. Clicks are the most popular interactive cue because of their efficiency, and click-based deep learning methods have attracted considerable interest in recent years. Most works encode click points as Gaussian maps and concatenate them with the image as the model's input. However, the spatial and semantic information in the Gaussian maps is degraded by noise as it passes through multiple convolution layers and is not fully exploited by the top layers for mask prediction. To pass click information to the top layers accurately and efficiently, we propose a coarse mask guided model (CMG), which predicts coarse masks with a coarse module to guide the object mask prediction. Specifically, the coarse module encodes user clicks as query features and enriches their semantic information with backbone features through transformer layers; coarse masks are then generated from the enriched query features and fed into CMG's decoder. Benefiting from the efficiency of the transformer, CMG's coarse module and decoder module are lightweight and computationally efficient, which makes the interaction process smoother. Experiments on several segmentation benchmarks demonstrate the effectiveness of our method, which achieves new state-of-the-art results compared with previous works.
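The baseline input encoding that the abstract describes (clicks rendered as Gaussian maps and concatenated with the image) can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, the split into positive and negative click channels, the peak value of 1, and `sigma=10.0` are all assumptions made for illustration only.

```python
import numpy as np


def gaussian_click_map(height, width, clicks, sigma=10.0):
    """Render (row, col) clicks as one heatmap.

    Each click becomes a 2-D Gaussian with peak value 1. The sigma value
    is an illustrative assumption, not a setting taken from the paper.
    """
    rows = np.arange(height, dtype=np.float32)[:, None]  # shape (H, 1)
    cols = np.arange(width, dtype=np.float32)[None, :]   # shape (1, W)
    heatmap = np.zeros((height, width), dtype=np.float32)
    for r, c in clicks:
        sq_dist = (rows - r) ** 2 + (cols - c) ** 2       # broadcasts to (H, W)
        # Keep the strongest response where several clicks overlap.
        heatmap = np.maximum(heatmap, np.exp(-sq_dist / (2.0 * sigma ** 2)))
    return heatmap


def build_model_input(image, positive_clicks, negative_clicks, sigma=10.0):
    """Concatenate an (H, W, 3) image with two click maps into (H, W, 5).

    Separate positive/negative channels are an assumed convention here.
    """
    height, width = image.shape[:2]
    pos = gaussian_click_map(height, width, positive_clicks, sigma)
    neg = gaussian_click_map(height, width, negative_clicks, sigma)
    return np.concatenate(
        [image.astype(np.float32), pos[..., None], neg[..., None]], axis=-1
    )


# Hypothetical example: a blank 64x96 RGB image with one click of each kind.
image = np.zeros((64, 96, 3), dtype=np.uint8)
model_input = build_model_input(
    image, positive_clicks=[(20, 30)], negative_clicks=[(50, 80)]
)
print(model_input.shape)  # (64, 96, 5)
```

In this encoding the click information enters only at the input layer, which is the limitation the abstract points to: the maps must survive every subsequent convolution before they can influence the final mask prediction.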
Pages: 5808-5822
Number of pages: 15