A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future

Cited by: 2

Authors:

Zhu, Chaoyang [1]

Chen, Long [1]

Affiliations:

[1] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Kowloon, Hong Kong, Peoples R China

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2024, Vol. 46, No. 12

Keywords:

Open-vocabulary; zero-shot learning; object detection; image segmentation; future directions; OBJECT; LANGUAGE;

DOI:

10.1109/TPAMI.2024.3413013

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

As the most fundamental scene understanding tasks, object detection and segmentation have made tremendous progress in the deep learning era. Due to the high cost of manual labeling, the annotated categories in existing datasets are often small-scale and pre-defined, so state-of-the-art fully-supervised detectors and segmentors fail to generalize beyond the closed vocabulary. To resolve this limitation, in the last few years, the community has paid increasing attention to Open-Vocabulary Detection (OVD) and Segmentation (OVS). By "open-vocabulary", we mean that the models can classify objects beyond pre-defined categories. In this survey, we provide a comprehensive review of recent developments in OVD and OVS. A taxonomy is first developed to organize different tasks and methodologies. We find that whether weak supervision signals are permitted, and how they are used, distinguishes the methodologies well; these include visual-semantic space mapping, novel visual feature synthesis, region-aware training, pseudo-labeling, knowledge distillation, and transfer learning. The proposed taxonomy is universal across different tasks, covering object detection, semantic/instance/panoptic segmentation, and 3D and video understanding. The main design principles, key challenges, development routes, and the strengths and weaknesses of each methodology are thoroughly analyzed.


Pages: 8954-8975

Number of pages: 22

