Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation

Cited by: 54
Authors
Li, Feng [1 ,3 ]
Zhang, Hao [1 ,3 ]
Xu, Huaizhe [1 ,3 ]
Liu, Shilong [2 ,3 ]
Zhang, Lei [3 ]
Ni, Lionel M. [1 ,4 ]
Shum, Heung-Yeung [1 ,3 ]
Affiliations
[1] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
[2] Tsinghua Univ, Inst AI, Dept CST, BNRist Ctr, Beijing, Peoples R China
[3] Int Digital Econ Acad IDEA, Shenzhen, Peoples R China
[4] Hong Kong Univ Sci & Technol, Guangzhou, Peoples R China
DOI
10.1109/CVPR52729.2023.00297
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we present Mask DINO, a unified object detection and segmentation framework. Mask DINO extends DINO (DETR with Improved Denoising Anchor Boxes) by adding a mask prediction branch that supports all image segmentation tasks (instance, panoptic, and semantic). It takes the dot product of the query embeddings from DINO with a high-resolution pixel embedding map to predict a set of binary masks. Some key components in DINO are extended for segmentation through a shared architecture and training process. Mask DINO is simple, efficient, and scalable, and it can benefit from joint large-scale detection and segmentation datasets. Our experiments show that Mask DINO significantly outperforms all existing specialized segmentation methods, both with a ResNet-50 backbone and with a pre-trained SwinL backbone. Notably, Mask DINO establishes the best results to date on instance segmentation (54.5 AP on COCO), panoptic segmentation (59.4 PQ on COCO), and semantic segmentation (60.8 mIoU on ADE20K) among models under one billion parameters. Code is available at https://github.com/IDEA-Research/MaskDINO.
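The mask prediction step described in the abstract (query embeddings dotted with a per-pixel embedding map) can be illustrated with a toy sketch. This is not the authors' implementation: the function name `predict_masks`, the nested-list tensor layout, and the sigmoid threshold of 0.5 are all assumptions made for illustration, and the real model uses learned embeddings rather than hand-written numbers.

```python
import math

def predict_masks(query_embed, pixel_embed, threshold=0.5):
    """Toy mask head (hypothetical helper, not from the paper's code).

    query_embed: Q queries, each a list of C floats.
    pixel_embed: C x H x W nested lists (one embedding vector per pixel).
    Returns Q binary masks, each H x W, as nested lists of bools.
    """
    num_channels = len(pixel_embed)
    height = len(pixel_embed[0])
    width = len(pixel_embed[0][0])
    # sigmoid(logit) > threshold is equivalent to logit > log(t / (1 - t)),
    # which avoids overflow in exp() for large-magnitude logits.
    logit_threshold = math.log(threshold / (1.0 - threshold))
    masks = []
    for query in query_embed:
        mask = []
        for y in range(height):
            row = []
            for x in range(width):
                # Dot product between the query and this pixel's embedding.
                logit = sum(query[c] * pixel_embed[c][y][x]
                            for c in range(num_channels))
                row.append(logit > logit_threshold)
            mask.append(row)
        masks.append(mask)
    return masks

# Illustrative values only: C=2 channels, a 2x2 "image", Q=2 queries.
pixel_embed = [
    [[1.0, -1.0], [0.5, 0.0]],   # channel 0
    [[0.0, 2.0], [-1.0, 0.0]],   # channel 1
]
query_embed = [[1.0, 0.0], [0.0, 1.0]]
masks = predict_masks(query_embed, pixel_embed)
for mask in masks:
    print(mask)
```

Each query yields one mask, so the number of predicted masks equals the number of queries; assigning a class to each mask is a separate step that this sketch leaves out.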
Pages: 3041-3050
Page count: 10