TAG: Guidance-Free Open-Vocabulary Semantic Segmentation

Cited by: 0

Authors:

Kawano, Yasufumi [1]

Aoki, Yoshimitsu [1]

Affiliations:

[1] Keio Univ, Grad Sch Integrated Design Engn, Yokohama, Kanagawa 2238522, Japan

Source:

IEEE Access | 2024 / Vol. 12

Keywords:

Semantic segmentation; Training; Databases; Annotations; Task analysis; Semantics; Vocabulary; Computer vision; Classification algorithms; open-vocabulary; zero-guidance;

DOI:

10.1109/ACCESS.2024.3418210

Chinese Library Classification number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Semantic segmentation is a crucial task in computer vision, where each pixel in an image is classified into a category. However, traditional methods face significant challenges, including the need for pixel-level annotations and extensive training. Furthermore, because supervised learning uses a limited set of predefined categories, models typically struggle with rare classes and cannot recognize new ones. Unsupervised and open-vocabulary segmentation, proposed to tackle these issues, face challenges of their own, including the inability to assign specific class labels to clusters and the necessity of user-provided text queries for guidance. In this context, we propose a novel approach, TAG, which achieves Training, Annotation, and Guidance-free open-vocabulary semantic segmentation. TAG utilizes pre-trained models such as CLIP and DINO to segment images into meaningful categories without additional training or dense annotations. It retrieves class labels from an external database, providing flexibility to adapt to new scenarios. TAG achieves state-of-the-art results on PascalVOC, PascalContext and ADE20K for open-vocabulary segmentation without given class names, including an improvement of +15.3 mIoU on PascalVOC.
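The abstract states that class labels are retrieved from an external database but does not say how. As a minimal sketch only, assuming the retrieval is a nearest-neighbor lookup by cosine similarity between a segment embedding and labelled database embeddings, the idea could look like the following. The function names, the toy three-dimensional vectors, and the database contents are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_label(segment_embedding, database):
    """Return the database label whose embedding is most similar to the segment.

    Assumption: `database` maps a label string to one embedding vector.
    """
    return max(database, key=lambda label: cosine(segment_embedding, database[label]))


# Toy embeddings invented for illustration; real embeddings would come from
# a pre-trained encoder and have hundreds of dimensions.
database = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.1, 0.9, 0.0],
    "sky": [0.0, 0.1, 0.9],
}
segments = {
    "segment_0": [0.8, 0.2, 0.1],
    "segment_1": [0.0, 0.2, 0.7],
}

# Assign each segment the label of its nearest database entry.
labels = {name: retrieve_label(vec, database) for name, vec in segments.items()}
print(labels)
```

Because the labels come from the database rather than from a fixed classifier head, swapping in a different database changes the vocabulary without any retraining, which is the flexibility the abstract describes.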

Citation

Pages: 88322-88331

Page count: 10
