MVImgNet: A Large-scale Dataset of Multi-view Images

Cited by: 7
Authors
Yu, Xianggang [1 ,2 ]
Xu, Mutian [1 ,2 ]
Zhang, Yidan [1 ,2 ]
Liu, Haolin [1 ,2 ]
Ye, Chongjie [1 ,2 ]
Wu, Yushuang [1 ,2 ]
Yan, Zizheng [1 ,2 ]
Zhu, Chenming [1 ,2 ]
Xiong, Zhangyang [1 ,2 ]
Liang, Tianyou [1 ,2 ]
Chen, Guanying [1 ,2 ]
Cui, Shuguang [1 ,2 ]
Han, Xiaoguang [1 ,2 ]
Affiliations
[1] CUHKSZ, FNii, Shenzhen, Peoples R China
[2] CUHKSZ, SSE, Shenzhen, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
DOI
10.1109/CVPR52729.2023.00883
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Being data-driven is one of the most iconic properties of deep learning algorithms. The birth of ImageNet [24] drove a remarkable trend of 'learning from large-scale data' in computer vision. Pretraining on ImageNet to obtain rich universal representations has been shown to benefit various 2D visual tasks and has become a standard in 2D vision. However, because collecting real-world 3D data is laborious, there is not yet a generic dataset serving as a counterpart of ImageNet in 3D vision, so how such a dataset could impact the 3D community remains unexplored. To remedy this, we introduce MVImgNet, a large-scale dataset of multi-view images, which is highly convenient to collect by shooting videos of real-world objects in daily life. It contains 6.5 million frames from 219,188 videos covering objects from 238 classes, with rich annotations of object masks, camera parameters, and point clouds. The multi-view attribute endows our dataset with 3D-aware signals, making it a soft bridge between 2D and 3D vision. We conduct pilot studies to probe the potential of MVImgNet on a variety of 3D and 2D visual tasks, including radiance field reconstruction, multi-view stereo, and view-consistent image understanding, where MVImgNet demonstrates promising performance, leaving many possibilities for future exploration. In addition, via dense reconstruction on MVImgNet, we derive a 3D object point cloud dataset called MVPNet, covering 87,200 samples from 150 categories, with a class label on each point cloud. Experiments show that MVPNet can benefit real-world 3D object classification while posing new challenges to point cloud understanding. MVImgNet and MVPNet will be made public, in the hope of inspiring the broader vision community.
Pages: 9150 - 9161
Page count: 12
Related papers
50 in total
  • [31] Large-scale multi-view subspace clustering via embedding space and partition matrix
    Cheng, Tianhang
    Peng, Jinjia
    Li, Hui
    Wang, Huibing
    NEUROCOMPUTING, 2024, 602
  • [32] Gait Analysis of Gender and Age Using a Large-Scale Multi-view Gait Database
    Makihara, Yasushi
    Mannami, Hidetoshi
    Yagi, Yasushi
    COMPUTER VISION - ACCV 2010, PT II, 2011, 6493 : 440 - 451
  • [33] Tensor-Derived Large-Scale Multi-View Subspace Clustering With Faithful Semantics
    Huang, Sujia
    Du, Shide
    Fu, Lele
    Wu, Zhihao
    Wang, Shiping
    IEEE TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING OVER NETWORKS, 2024, 10 : 584 - 598
  • [34] Joint Multi-View Hashing for Large-Scale Near-Duplicate Video Retrieval
    Nie, Xiushan
    Jing, Weizhen
    Cui, Chaoran
    Zhang, Chen Jason
    Zhu, Lei
    Yin, Yilong
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2020, 32 (10) : 1951 - 1965
  • [35] Edge aware depth inference for large-scale aerial building multi-view stereo
    Zhang, Song
    Wei, Zhiwei
    Xu, Wenjia
    Zhang, Lili
    Wang, Yang
    Zhang, Jinming
    Liu, Junyi
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2024, 207 : 27 - 42
  • [36] Design and evaluation of a large-scale autostereoscopic multi-view laser display for outdoor applications
    Reitterer, Joerg
    Fidler, Franz
    Schmid, Gerhard
    Riel, Thomas
    Hambeck, Christian
    Saint Julien-Wallsee, Ferdinand
    Leeb, Walter
    Schmid, Ulrich
    OPTICS EXPRESS, 2014, 22 (22) : 27063 - 27068
  • [37] Align then Fusion: Generalized Large-scale Multi-view Clustering with Anchor Matching Correspondences
    Wang, Siwei
    Liu, Xinwang
    Liu, Suyuan
    Jin, Jiaqi
    Tu, Wenxuan
    Zhu, Xinzhong
    Zhu, En
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [39] Semi-supervised multi-view binary learning for large-scale image clustering
    Liu, Mingyang
    Yang, Zuyuan
    Han, Wei
    Chen, Junhang
    Sun, Weijun
    APPLIED INTELLIGENCE, 2022, 52 (13) : 14853 - 14870
  • [40] High completeness multi-view stereo for dense reconstruction of large-scale urban scenes
    Liao, Yongjian
    Zhang, Xuexi
    Huang, Nan
    Fu, Chuanyu
    Huang, Zijie
    Cao, Qiku
    Xu, Zexi
    Xiong, Xiaoming
    Cai, Shuting
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2024, 209 : 173 - 196