Advancing Referring Expression Segmentation Beyond Single Image

Cited by: 3
Authors
Wu, Yixuan [1, 2]
Zhang, Zhao [1]
Xie, Chi [1, 3]
Zhu, Feng [1]
Zhao, Rui [1, 4]
Affiliations
[1] SenseTime Research, Shanghai, People's Republic of China
[2] Zhejiang University, Hangzhou, People's Republic of China
[3] Tongji University, Shanghai, People's Republic of China
[4] Shanghai Jiao Tong University, Qing Yuan Research Institute, Shanghai, People's Republic of China
DOI
10.1109/ICCV51070.2023.00248
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Referring Expression Segmentation (RES) is a widely explored multi-modal task, which endeavors to segment the pre-existing object within a single image with a given linguistic expression. However, in broader real-world scenarios, it is not always possible to determine if the described object exists in a specific image. Generally, a collection of images is available, some of which potentially contain the target objects. To this end, we propose a more realistic setting, named Group-wise Referring Expression Segmentation (GRES), which expands RES to a group of related images, allowing the described objects to exist in a subset of the input image group. To support this new setting, we introduce an elaborately compiled dataset named Grouped Referring Dataset (GRD), containing complete group-wise annotations of the target objects described by given expressions. Moreover, we also present a baseline method named Grouped Referring Segmenter (GRSer), which explicitly captures the language-vision and intra-group vision-vision interactions to achieve state-of-the-art results on the proposed GRES setting and related tasks, such as Co-Salient Object Detection and traditional RES. Our dataset and code are publicly released at https://github.com/shikras/d-cube.
Pages: 2628-2638
Page count: 11
Related papers
50 in total
  • [1] Image Segmentation With Language Referring Expression and Comprehension
    Sun, Jiaxing
    Li, Yujie
    Cai, Jintong
    Lu, Huimin
    Serikawa, Seiichi
    [J]. IEEE SENSORS JOURNAL, 2022, 22 (18) : 17406 - 17413
  • [2] Beyond One-to-One: Rethinking the Referring Image Segmentation
    Hu, Yutao
    Wang, Qixiong
    Shao, Wenqi
    Xie, Enze
    Li, Zhenguo
    Han, Jungong
    Luo, Ping
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 4044 - 4054
  • [3] Query Reconstruction Network for Referring Expression Image Segmentation
    Shi, Hengcan
    Li, Hongliang
    Wu, Qingbo
    Ngan, King Ngi
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 995 - 1007
  • [4] Key-Word-Aware Network for Referring Expression Image Segmentation
    Shi, Hengcan
    Li, Hongliang
    Meng, Fanman
    Wu, Qingbo
    [J]. COMPUTER VISION - ECCV 2018, PT VI, 2018, 11210 : 38 - 54
  • [5] Hierarchical collaboration for referring image segmentation
    Zhang, Wei
    Cheng, Zesen
    Chen, Jie
    Gao, Wen
    [J]. Neurocomputing, 2025, 613
  • [6] Toward Robust Referring Image Segmentation
    Wu, Jianzong
    Li, Xiangtai
    Li, Xia
    Ding, Henghui
    Tong, Yunhai
    Tao, Dacheng
    [J]. IEEE Transactions on Image Processing, 2024, 33 : 1782 - 1794
  • [8] GRES: Generalized Referring Expression Segmentation
    Liu, Chang
    Ding, Henghui
    Jiang, Xudong
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 23592 - 23601
  • [9] Meta Compositional Referring Expression Segmentation
    Xu, Li
    Huang, Mark He
    Shang, Xindi
    Yuan, Zehuan
    Sun, Ying
    Liu, Jun
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 19478 - 19487
  • [10] RRSIS: Referring Remote Sensing Image Segmentation
    Yuan, Zhenghang
    Mou, Lichao
    Hua, Yuansheng
    Zhu, Xiao Xiang
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 12