V²MLP: an accurate and simple multi-view MLP network for fine-grained 3D shape recognition

Cited by: 0
Authors
Liang Zheng [1]
Jing Bai [1]
Shaojin Bai [2]
Wenjing Li [1]
Bin Peng [1]
Tao Zhou [1]
Affiliations
[1] North Minzu University, School of Computer Science and Engineering
[2] North Minzu University, The Key Laboratory of Images processing and Pattern Laboratory, Commission: IPPRLab
Keywords
3D shape recognition; Fine-grained recognition; MLP; Multi-view; Cross-view
DOI
10.1007/s00371-023-03191-4
Abstract
Fine-grained 3D shape recognition (FGSR) is crucial for real-world applications. Existing methods struggle to achieve high accuracy on FGSR because of the high similarity within sub-categories and the low dissimilarity between them, especially when part location or attribute annotations are unavailable. In this paper, we propose V²MLP, a multi-view representation-oriented MLP network dedicated to FGSR that uses only class labels as supervision. V²MLP comprises two key modules: the cross-view interaction MLP (CVI-MLP) and the cross-view fusion MLP (CVF-MLP). The CVI-MLP module captures local and global contextual information through cross-view interactions, extracting discriminative view features that reinforce the subtle differences between sub-categories. The CVF-MLP module aggregates these view features across both the space and view dimensions to obtain the final 3D shape features, minimizing information loss during view feature fusion. Extensive experiments on three categories from the FG3D dataset demonstrate the effectiveness of V²MLP in learning discriminative features for 3D shapes, achieving state-of-the-art accuracy for FGSR. Additionally, V²MLP performs competitively for meta-category recognition on the ModelNet40 dataset.
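
To make the two-stage idea in the abstract more concrete, the following is a minimal NumPy sketch of cross-view interaction followed by cross-view fusion. It is not the authors' architecture: the Mixer-style MLP over the view axis, the linear aggregation weights, the tensor layout (views, patches, channels), and all function names and sizes are assumptions chosen only for illustration.

```python
import numpy as np


def gelu(x):
    # Tanh approximation of the GELU activation.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def mlp(x, w1, w2):
    # Two-layer MLP applied along the last axis of x.
    return gelu(x @ w1) @ w2


def cross_view_interaction(x, w1, w2):
    # Hypothetical stand-in for CVI-MLP: mix information across views.
    # x has shape (V, P, C); the MLP acts on the view axis and the result
    # is added back to the input as a residual.
    xt = np.moveaxis(x, 0, -1)           # (P, C, V)
    mixed = mlp(xt, w1, w2)              # (P, C, V)
    return x + np.moveaxis(mixed, -1, 0)  # back to (V, P, C)


def cross_view_fusion(x, w_space, w_view):
    # Hypothetical stand-in for CVF-MLP: aggregate over the space axis,
    # then over the view axis, using weights instead of max pooling.
    per_view = np.einsum("vpc,p->vc", x, w_space)  # (V, C)
    return np.einsum("vc,v->c", per_view, w_view)   # (C,)


# Illustrative sizes (assumed): 12 views, 49 patch tokens, 64 channels.
V, P, C, H = 12, 49, 64, 24
rng = np.random.default_rng(0)
x = rng.standard_normal((V, P, C))
w1 = rng.standard_normal((V, H)) * 0.1
w2 = rng.standard_normal((H, V)) * 0.1

features = cross_view_interaction(x, w1, w2)
shape_feature = cross_view_fusion(features, np.full(P, 1.0 / P), np.full(V, 1.0 / V))
print(features.shape, shape_feature.shape)
```

In a trained model the aggregation weights would be learned; the uniform weights used here reduce the fusion step to plain averaging and serve only to keep the sketch self-checking.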
Pages: 6655-6670
Page count: 15