DESIGN AND RESEARCH OF A MULTI-VIEW GRAPH DEEP LEARNING 3D MODEL RETRIEVAL SYSTEM BASED ON FUSION VISION-TRANSFORMER

Authors
Liang, Rong [1]
Li, Fangping [1]
Affiliation
[1] Taiyuan Univ, Dept Art & Design, 7 Fendong St,Tanghuai Ind Pk, Taiyuan 030000, Peoples R China
Keywords
Vision-Transformer; Multi-perspective graph convolutional neural network; 3D; Perspective image; Image entropy;
DOI
10.24507/ijicic.20.06.1775
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Advances in computer vision have given three-dimensional (3D) models a crucial role in image processing. Compared with 2D models, however, 3D models carry more features, which makes it harder to extract those features and to mine the correlations among them. To address this, the study builds on a multi-perspective graph convolutional neural network and replaces the original view pooling layer with an image entropy weight pooling layer, which assigns each perspective image a weight based on its image entropy before view pooling is performed. A Vision-Transformer module is embedded in the network to mine the associations among the multi-view graphs. The results show that the model fused with Vision-Transformer groups features of the same category more tightly in the view graph and keeps features of different categories clearly separated. With 6, 10, and 14 view images, it achieves accuracies of 89.0%, 92.0%, and 94.0% and mean average precision (mAP) values of 80.0%, 85.0%, and 88.0%, respectively. The study improves the retrieval accuracy of 3D models and offers a reference for related work in computer vision.
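The entropy-weighted view pooling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `image_entropy` and `entropy_weighted_pool` are hypothetical, the weights are assumed to be each view's Shannon entropy normalized to sum to 1, and a weighted sum stands in for the pooling step, since the abstract does not say how the weights are combined with the pooling operation.

```python
import numpy as np


def image_entropy(gray, bins=256):
    """Shannon entropy (in bits) of an 8-bit grayscale image's histogram."""
    hist, _ = np.histogram(gray, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins so log2 is defined
    return float(-(p * np.log2(p)).sum())


def entropy_weighted_pool(views, features):
    """Pool per-view feature vectors using entropy-based weights.

    views:    list of 2D uint8 arrays, one rendered image per viewpoint
    features: array of shape (num_views, feature_dim), e.g. CNN outputs
    Assumption: weight_i = entropy_i / sum(entropies), followed by a
    weighted sum over views. The paper may combine them differently.
    """
    features = np.asarray(features, dtype=float)
    entropies = np.array([image_entropy(v) for v in views])
    total = entropies.sum()
    if total == 0:
        # All views are flat images: fall back to uniform weights.
        weights = np.full(len(views), 1.0 / len(views))
    else:
        weights = entropies / total
    pooled = (weights[:, None] * features).sum(axis=0)
    return pooled, weights


# Two toy views: a flat image (entropy 0) and a half-black, half-white one.
flat = np.zeros((4, 4), dtype=np.uint8)
half = np.zeros((4, 4), dtype=np.uint8)
half[:, 2:] = 255
pooled, weights = entropy_weighted_pool([flat, half], [[1.0, 1.0], [3.0, 5.0]])
```

In this toy case the flat view carries no information, so the pooled descriptor comes entirely from the second view. Views with richer content contribute more to the final descriptor, which is the intuition behind replacing a uniform pooling layer with an entropy-weighted one.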
Pages: 1775-1788
Page count: 14