Multi-Range View Aggregation Network With Vision Transformer Feature Fusion for 3D Object Retrieval

Cited by: 7
Authors
Lin, Dongyun [1]
Li, Yiqun [1]
Cheng, Yi [1]
Prasad, Shitala [1]
Guo, Aiyuan [1]
Cao, Yanpeng [2]
Affiliations
[1] ASTAR, Inst Infocomm Res I2R, Singapore 138632, Singapore
[2] Zhejiang Univ, State Key Lab Fluid Power & Mechatron Syst, Hangzhou 310027, Peoples R China
Keywords
Three-dimensional displays; Feature extraction; Transformers; Convolutional neural networks; Visualization; Fuses; Deep learning; 3D object retrieval; multi-range view aggregation; multi-head self-attention; feature fusion; SIMILARITY; DIFFUSION;
DOI
10.1109/TMM.2023.3246229
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
View-based methods have achieved state-of-the-art performance in 3D object retrieval. However, they still face two major challenges. The first is how to leverage inter-view correlation to enhance view-level visual features. The second is how to effectively fuse view-level features into a discriminative global descriptor. To address these two challenges, we propose a multi-range view aggregation network (MRVA-Net) with a vision-transformer-based feature fusion scheme for 3D object retrieval. Unlike existing methods, which aggregate only neighboring or adjacent views and may therefore bring in redundant information, we propose a multi-range view aggregation module that enhances individual view representations by aggregating not only neighboring views but also views at different ranges. Furthermore, to generate the global descriptor from view-level features, we propose to employ the multi-head self-attention mechanism introduced by the vision transformer to fuse the view-level features. Extensive experiments conducted on three public datasets, namely ModelNet40, ShapeNet Core55 and MCB-A, demonstrate the superiority of the proposed network over state-of-the-art methods in 3D object retrieval.
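As a rough illustration of the fusion step mentioned in the abstract, the sketch below applies generic multi-head self-attention across a set of view-level features and then averages the attended views into one global descriptor. It is not the MRVA-Net implementation: the function name `fuse_views`, the random projection matrices, the feature sizes (12 views, 64 dimensions, 4 heads) and the final mean pooling are all assumptions made for illustration, and the multi-range aggregation module is not modeled.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def fuse_views(view_feats, w_q, w_k, w_v, w_o, num_heads):
    """Fuse (V, D) view-level features into a single (D,) descriptor.

    Hypothetical helper: standard multi-head self-attention over the views,
    followed by mean pooling (the pooling choice is an assumption).
    """
    n_views, dim = view_feats.shape
    head_dim = dim // num_heads

    def split_heads(x):
        # (V, D) -> (H, V, head_dim)
        return x.reshape(n_views, num_heads, head_dim).transpose(1, 0, 2)

    q = split_heads(view_feats @ w_q)
    k = split_heads(view_feats @ w_k)
    v = split_heads(view_feats @ w_v)

    # Scaled dot-product attention between every pair of views, per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)  # (H, V, V)
    attn = softmax(scores, axis=-1)
    context = attn @ v  # (H, V, head_dim)

    # Merge heads back to (V, D) and apply the output projection.
    merged = context.transpose(1, 0, 2).reshape(n_views, dim)
    attended = merged @ w_o

    # Average the attended view features into one global descriptor.
    return attended.mean(axis=0), attn


rng = np.random.default_rng(0)
V, D, H = 12, 64, 4  # illustrative sizes, not taken from the paper
feats = rng.standard_normal((V, D))
w_q, w_k, w_v, w_o = (rng.standard_normal((D, D)) / np.sqrt(D) for _ in range(4))

descriptor, attn = fuse_views(feats, w_q, w_k, w_v, w_o, H)
print(descriptor.shape, attn.shape)  # (64,) (4, 12, 12)
```

Each row of `attn` sums to 1, so every view's updated feature is a weighted mix of all views; in a retrieval setting the resulting descriptors would typically be compared with a distance such as cosine or Euclidean distance.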
Pages: 9108-9119
Number of pages: 12
Related papers
50 in total
  • [1] Multi-View Hierarchical Fusion Network for 3D Object Retrieval and Classification
    Liu, An-An
    Hu, Nian
    Song, Dan
    Guo, Fu-Bin
    Zhou, He-Yu
    Hao, Tong
    IEEE ACCESS, 2019, 7 : 153021 - 153030
  • [2] Multi-view convolutional vision transformer for 3D object recognition
    Li, Jie
    Liu, Zhao
    Li, Li
    Lin, Junqin
    Yao, Jian
    Tu, Jingmin
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2023, 95
  • [3] Multi-View 3D Object Retrieval With Deep Embedding Network
    Guo, Haiyun
    Wang, Jinqiao
    Gao, Yue
    Li, Jianqiang
    Lu, Hanqing
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 25 (12) : 5526 - 5537
  • [4] Feature representation for 3D object retrieval based on unconstrained multi-view
    Bin Zhou
    Xuanyin Wang
    Multimedia Systems, 2022, 28 : 1699 - 1711
  • [5] Feature representation for 3D object retrieval based on unconstrained multi-view
    Zhou, Bin
    Wang, Xuanyin
    MULTIMEDIA SYSTEMS, 2022, 28 (05) : 1699 - 1711
  • [6] Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features
    Lin, Dongyun
    Li, Yiqun
    Cheng, Yi
    Prasad, Shitala
    Nwe, Tin Lay
    Dong, Sheng
    Guo, Aiyuan
    KNOWLEDGE-BASED SYSTEMS, 2022, 247
  • [7] 3D Model Retrieval Based on Vision Feature Fusion
    Zhang, Mandun
    Ma, Yingshi
    Wang, Xiaofang
    Wei, Wei
    Xiao, Zhidong
    PROCEEDINGS OF 2018 5TH IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS (CCIS), 2018, : 905 - 909
  • [8] Multi-View Token Clustering and Fusion for 3D Object Recognition and Retrieval
    Fan, Linlong
    Ge, Yanqi
    Li, Wen
    Duan, Lixin
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 1145 - 1150
  • [9] Multi-View Joint Learning and BEV Feature-Fusion Network for 3D Object Detection
    Liu, Qunming
    Li, Xiaodong
    Zhang, Xiaofei
    Tan, Xiaojun
    Shi, Bodong
    APPLIED SCIENCES-BASEL, 2023, 13 (09):
  • [10] SKETCH-BASED 3D SHAPE RETRIEVAL WITH MULTI-VIEW FUSION TRANSFORMER
    Zhu, Cunjuan
    Cui, Dongdong
    Jia, Qi
    Wang, Weimin
    Liu, Yu
    Lew, Michael S.
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 3005 - 3009