MUVA: A New Large-Scale Benchmark for Multi-view Amodal Instance Segmentation in the Shopping Scenario

Cited by: 1
Authors
Li, Zhixuan [1]
Ye, Weining [1]
Terven, Juan [2]
Bennett, Zachary [2]
Zheng, Ying [2]
Jiang, Tingting [1]
Huang, Tiejun [1,3]
Affiliations
[1] Peking Univ, Sch Comp Sci, Natl Engn Res Ctr Visual Technol, Natl Key Lab Multimedia Informat Proc, Beijing 100871, Peoples R China
[2] AiFi Inc, Santa Clara, CA 94010 USA
[3] Beijing Acad Artificial Intelligence, Beijing 100084, Peoples R China
DOI
10.1109/ICCV51070.2023.02148
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Amodal Instance Segmentation (AIS) aims to infer the complete shapes of objects that are partially or fully occluded. However, the task is inherently ill-posed on single-view datasets, which makes occluded shapes difficult to determine. A multi-view framework may help alleviate this problem, as humans often adjust their perspective when encountering occluded objects. This approach has not yet been explored by existing methods and datasets. To bridge this gap, we propose a new task called Multi-view Amodal Instance Segmentation (MAIS) and introduce the MUVA dataset, the first MUlti-View AIS dataset, which is instantiated in the shopping scenario. MUVA provides comprehensive annotations, including multi-view amodal/visible segmentation masks, 3D models, and depth maps, making it the largest image-level AIS dataset in terms of both the number of images and instances. Additionally, we propose a new method for aggregating representative features across different instances and views, which demonstrates promising results in accurately predicting occluded objects from one viewpoint by leveraging information from other viewpoints. We also demonstrate that MUVA can benefit the AIS task in real-world scenarios.
Pages: 23447-23456
Number of pages: 10