Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features

被引:11
|
作者
Lin, Dongyun [1 ]
Li, Yiqun [1 ]
Cheng, Yi [1 ]
Prasad, Shitala [1 ]
Nwe, Tin Lay [1 ]
Dong, Sheng [1 ]
Guo, Aiyuan [1 ]
机构
[1] ASTAR, Inst Infocomm Res, 1 Fusionopolis Way,21-01 Connexis South Tower, Singapore 138632, Singapore
关键词
View-based 3D object retrieval; View attention module; Instance attention module; ArcFace loss; Cosine distance triplet -center loss; CONVOLUTIONAL NEURAL-NETWORK;
D O I
10.1016/j.knosys.2022.108754
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In multi-view 3D object retrieval tasks, it is pivotal to aggregate visual features extracted from multiple view images to generate a discriminative representation for a 3D object. The existing multi-view convolutional neural network employs view pooling for feature aggregation, which ignores the local view-relevant discriminative information within each view image and the global correlative information across all view images. To leverage both types of information, we propose two self -attention modules, namely, View Attention Module and Instance Attention Module, to learn view and instance attentive features, respectively. The final representation of a 3D object is the aggregation of three features: original, view-attentive, and instance-attentive. Furthermore, we propose employing the ArcFace loss together with the cosine-distance-based triplet-center loss as the metric learning guidance to train our model. As the cosine distance is used to rank the retrieval results, our angular metric learning losses achieve a consistent objective between the training and testing processes, thereby facilitating discriminative feature learning. Extensive experiments and ablation studies are conducted on four publicly available datasets on 3D object retrieval to show the superiority of the proposed method over multiple state-of-the-art methods. (C)& nbsp;2022 Elsevier B.V. All rights reserved.
引用
收藏
页数:12
相关论文
共 50 条
  • [1] A Compact Multi-View Descriptor for 3D Object Retrieval
    Daras, Petros
    Axenopoulos, Apostolos
    [J]. CBMI: 2009 INTERNATIONAL WORKSHOP ON CONTENT-BASED MULTIMEDIA INDEXING, 2009, : 115 - 119
  • [2] Multi-View 3D Object Retrieval With Deep Embedding Network
    Guo, Haiyun
    Wang, Jinqiao
    Gao, Yue
    Li, Jianqiang
    Lu, Hanqing
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 25 (12) : 5526 - 5537
  • [3] Multi-view and multivariate gaussian descriptor for 3D object retrieval
    Gao, Zan
    Xue, Kai-Xin
    Zhang, Hua
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (01) : 555 - 572
  • [4] Multi-view and multivariate gaussian descriptor for 3D object retrieval
    Zan Gao
    Kai-Xin Xue
    Hua Zhang
    [J]. Multimedia Tools and Applications, 2019, 78 : 555 - 572
  • [5] Emphasizing 3D Properties in Recurrent Multi-View Aggregation for 3D Shape Retrieval
    Xu, Cheng
    Leng, Biao
    Zhang, Cheng
    Zhou, Xiaochen
    [J]. THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 7428 - 7435
  • [6] 3D Object Retrieval Based on Multi-View Latent Variable Model
    Liu, An-An
    Nie, Wei-Zhi
    Su, Yu-Ting
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2019, 29 (03) : 868 - 880
  • [7] Multi-View Hierarchical Fusion Network for 3D Object Retrieval and Classification
    Liu, An-An
    Hu, Nian
    Song, Dan
    Guo, Fu-Bin
    Zhou, He-Yu
    Hao, Tong
    [J]. IEEE ACCESS, 2019, 7 : 153021 - 153030
  • [8] 3D object retrieval based on multi-view convolutional neural networks
    Xi-Xi Li
    Qun Cao
    Sha Wei
    [J]. Multimedia Tools and Applications, 2017, 76 : 20111 - 20124
  • [9] Hierarchical multi-view context modelling for 3D object classification and retrieval
    Liu, An-An
    Zhou, Heyu
    Nie, Weizhi
    Liu, Zhenguang
    Liu, Wu
    Xie, Hongtao
    Mao, Zhendong
    Li, Xuanya
    Song, Dan
    [J]. INFORMATION SCIENCES, 2021, 547 : 984 - 995
  • [10] Feature representation for 3D object retrieval based on unconstrained multi-view
    Bin Zhou
    Xuanyin Wang
    [J]. Multimedia Systems, 2022, 28 : 1699 - 1711