DESIGN AND RESEARCH OF A MULTI-VIEW GRAPH DEEP LEARNING 3D MODEL RETRIEVAL SYSTEM BASED ON FUSION VISION-TRANSFORMER

Cited by: 0
Authors
Liang, Rong [1]
Li, Fangping [1]
Affiliations
[1] Taiyuan Univ, Dept Art & Design, 7 Fendong St,Tanghuai Ind Pk, Taiyuan 030000, Peoples R China
Keywords
Vision-Transformer; Multi-perspective graph convolutional neural network; 3D; Perspective image; Image entropy
DOI
10.24507/ijicic.20.06.1775
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The development of computer vision has given three-dimensional (3D) models a crucial role in image processing. However, 3D models carry more features than 2D models, which makes it difficult to extract those features and to mine the correlations between them. To address this, this study builds on a multi-perspective graph convolutional neural network and replaces the original view pooling layer with an image entropy weight pooling layer, which assigns each perspective image a weight based on its image entropy before performing the view pooling operation. A Vision-Transformer module is embedded into the multi-perspective graph convolutional neural network to mine the associations between multi-view graphs. The results show that the multi-perspective graph convolutional neural network fused with Vision-Transformer clusters features of the same category in the view graph more tightly and separates features of different categories by a significant distance. With 6, 10, and 14 view images, the fused model achieves accuracy of 89.0%, 92.0%, and 94.0% and mean average precision of 80.0%, 85.0%, and 88.0%, respectively. This study improves the retrieval accuracy of 3D models and provides a reference for the field of computer vision.
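The entropy-weighted view pooling described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so everything here is an assumption: the entropy is taken as the Shannon entropy of each view's gray-level histogram, the weights are obtained by simple sum-normalization, and the function names (image_entropy, entropy_weights, entropy_weighted_pool) are hypothetical. The toy views and feature vectors are made-up inputs, not data from the study.

```python
import math
from collections import Counter


def image_entropy(pixels):
    """Shannon entropy (bits) of the gray-level histogram of one view image.

    `pixels` is a flat sequence of integer gray levels, e.g. 0..255.
    """
    counts = Counter(pixels)
    total = len(pixels)
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy


def entropy_weights(views):
    """Normalize per-view entropies so the weights sum to 1.

    Assumption: plain sum-normalization, with a fallback to uniform
    weights when every view has zero entropy.
    """
    entropies = [image_entropy(v) for v in views]
    total = sum(entropies)
    if total == 0.0:
        return [1.0 / len(views)] * len(views)
    return [e / total for e in entropies]


def entropy_weighted_pool(views, features):
    """Weighted sum of per-view feature vectors using entropy weights."""
    weights = entropy_weights(views)
    dim = len(features[0])
    pooled = [0.0] * dim
    for w, feat in zip(weights, features):
        for i in range(dim):
            pooled[i] += w * feat[i]
    return pooled


# Toy example (made-up data): three 4-pixel "views", one feature vector each.
views = [
    [0, 0, 0, 0],        # constant image: entropy 0 bits
    [0, 0, 255, 255],    # two equally likely levels: entropy 1 bit
    [0, 85, 170, 255],   # four equally likely levels: entropy 2 bits
]
features = [[1.0, 0.0], [0.0, 1.0], [3.0, 3.0]]
weights = entropy_weights(views)
pooled = entropy_weighted_pool(views, features)
```

Under these assumptions, views with richer gray-level content contribute more to the pooled descriptor, whereas a plain max or mean view pooling layer treats all views alike. In a real network the weights would scale learned feature maps rather than hand-written vectors.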
Pages: 1775-1788
Number of pages: 14
Related papers
50 in total
  • [21] Multi-View Token Clustering and Fusion for 3D Object Recognition and Retrieval
    Fan, Linlong
    Ge, Yanqi
    Li, Wen
    Duan, Lixin
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023: 1145 - 1150
  • [22] 3D model retrieval based on multi-view attentional convolutional neural network
    Liu, An-An
    Zhou, He-Yu
    Li, Meng-Jie
    Nie, Wei-Zhi
    MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (7-8) : 4699 - 4711
  • [24] Learning View-Based Graph Convolutional Network for Multi-View 3D Shape Analysis
    Wei, Xin
    Yu, Ruixuan
    Sun, Jian
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (06) : 7525 - 7541
  • [25] Exploring Deep Learning for View-Based 3D Model Retrieval
    Gao, Zan
    Li, Yinming
    Wan, Shaohua
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2020, 16 (01)
  • [26] Multi-View Tree Structure Learning for 3D Model Retrieval and Classification in Smart City
    Liu, An-An
    Zhao, Zhenlan
    Li, Wenhui
    Song, Dan
    IEEE ACCESS, 2020, 8 : 129743 - 129753
  • [27] Multi-View Transformer for 3D Visual Grounding
    Huang, Shijia
    Chen, Yilun
    Jia, Jiaya
    Wang, Liwei
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 15503 - 15512
  • [28] Multi-view 3D reconstruction based on deep learning: A survey and comparison of methods
    Wu, Juhao
    Wyman, Omar
    Tang, Yadong
    Pasini, Damiano
    Wang, Wenlong
    Neurocomputing, 2024, 582
  • [29] 3D Model Retrieval Based on Vision Feature Fusion
    Zhang, Mandun
    Ma, Yingshi
    Wang, Xiaofang
    Wei, Wei
    Xiao, Zhidong
    PROCEEDINGS OF 2018 5TH IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS (CCIS), 2018: 905 - 909