Advancing Referring Expression Segmentation Beyond Single Image

Cited by: 3
Authors
Wu, Yixuan [1, 2]
Zhang, Zhao [1]
Xie, Chi [1, 3]
Zhu, Feng [1]
Zhao, Rui [1, 4]
Affiliations
[1] SenseTime Res, Shanghai, Peoples R China
[2] Zhejiang Univ, Hangzhou, Peoples R China
[3] Tongji Univ, Shanghai, Peoples R China
[4] Shanghai Jiao Tong Univ, Qing Yuan Res Inst, Shanghai, Peoples R China
DOI
10.1109/ICCV51070.2023.00248
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Referring Expression Segmentation (RES) is a widely explored multi-modal task, which endeavors to segment the pre-existing object within a single image with a given linguistic expression. However, in broader real-world scenarios, it is not always possible to determine if the described object exists in a specific image. Generally, a collection of images is available, some of which potentially contain the target objects. To this end, we propose a more realistic setting, named Group-wise Referring Expression Segmentation (GRES), which expands RES to a group of related images, allowing the described objects to exist in a subset of the input image group. To support this new setting, we introduce an elaborately compiled dataset named Grouped Referring Dataset (GRD), containing complete group-wise annotations of the target objects described by given expressions. Moreover, we also present a baseline method named Grouped Referring Segmenter (GRSer), which explicitly captures the language-vision and intra-group vision-vision interactions to achieve state-of-the-art results on the proposed GRES setting and related tasks, such as Co-Salient Object Detection and traditional RES. Our dataset and code are publicly released at https://github.com/shikras/d-cube.
Pages: 2628 - 2638
Page count: 11