User-Click-Data-Based Fine-Grained Image Recognition via Weakly Supervised Metric Learning

Cited by: 13
Authors
Tan, Min [1]
Yu, Jun [1]
Yu, Zhou [1]
Gao, Fei [1]
Rui, Yong [2]
Tao, Dacheng [3]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Comp Sci & Technol, 1158,2nd Ave, Hangzhou 310018, Zhejiang, Peoples R China
[2] Lenovo, 6 Shang Di West Rd, Beijing 100085, Peoples R China
[3] Univ Sydney, Fac Engn & Informat Technol, Sydney, NSW, Australia
Funding
National Natural Science Foundation of China; Australian Research Council
Keywords
Metric learning; fine-grained image recognition; user click data; convolutional neural network; weakly supervised learning; FEATURES; CONSTRAINTS;
DOI
10.1145/3209666
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We present a novel fine-grained image recognition framework using user click data, which can bridge the semantic gap in distinguishing visually similar categories. As the query set in click data is usually large-scale and redundant, we first propose a click-feature-based query-merging approach to merge queries with similar semantics and construct a compact click feature. Afterward, we use this compact click feature and a convolutional neural network (CNN)-based deep visual feature to jointly represent an image. Finally, with the combined feature, we employ a metric-learning-based template-matching scheme for efficient recognition. Considering the heavy noise in the training data, we introduce a reliability variable to characterize image reliability, and propose a weakly-supervised metric and template learning with smooth assumption and click prior (WMTLSC) method to jointly learn the distance metric, object templates, and image reliability. Extensive experiments are conducted on the public Clickture-Dog dataset and our newly established Clickture-Bird dataset. The results show that click-data-based query merging helps generate a highly compact (the dimension is reduced to 0.9%) and dense click feature for images, which greatly improves computational efficiency. In addition, combining this click feature with the CNN feature further boosts recognition accuracy. The proposed framework performs much better than previous state-of-the-art methods in fine-grained recognition tasks.
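The pipeline described in the abstract (merge queries into a compact click feature, concatenate it with a visual feature, then match against class templates under a learned metric) can be illustrated with a minimal Python sketch. This is not the authors' WMTLSC method: the query grouping, the metric matrix `M`, and the templates are assumed to be given rather than learned, and all function names, labels, and values are hypothetical.

```python
from typing import Dict, List, Sequence


def merge_click_feature(clicks: Sequence[float],
                        group_of_query: Sequence[int],
                        n_groups: int) -> List[float]:
    """Sum per-query click counts into per-group counts.

    `group_of_query[i]` is the (assumed, precomputed) group index of query i.
    """
    merged = [0.0] * n_groups
    for count, group in zip(clicks, group_of_query):
        merged[group] += count
    return merged


def metric_distance(x: Sequence[float],
                    t: Sequence[float],
                    M: Sequence[Sequence[float]]) -> float:
    """Squared distance (x - t)^T M (x - t) for a square matrix M."""
    d = [a - b for a, b in zip(x, t)]
    n = len(d)
    return sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))


def classify(x: Sequence[float],
             templates: Dict[str, List[float]],
             M: Sequence[Sequence[float]]) -> str:
    """Return the label whose template is closest to x under M."""
    return min(templates, key=lambda label: metric_distance(x, templates[label], M))


# Hypothetical toy data: four queries merged into two groups.
click_feature = merge_click_feature([3, 1, 0, 2], [0, 0, 1, 1], n_groups=2)
visual_feature = [0.5]                      # stand-in for a CNN feature
combined = click_feature + visual_feature   # joint representation

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
templates = {"husky": [4.0, 1.0, 0.4], "malamute": [0.0, 5.0, 0.6]}
predicted = classify(combined, templates, identity)
```

With an identity `M` this reduces to nearest-template matching under squared Euclidean distance; a learned metric would replace `identity`, and the reliability weighting of training images mentioned in the abstract is not modeled here.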
Pages: 23
Related papers
50 in total
  • [21] Hierarchical Deep Click Feature Prediction for Fine-Grained Image Recognition
    Yu, Jun
    Tan, Min
    Zhang, Hongyuan
    Tao, Dacheng
    Rui, Yong
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (02) : 563 - 578
  • [22] Weakly supervised fine-grained recognition based on spatial-channel aware attention filters
    Yu, Nannan
    Huang, Lei
    Wei, Zhiqiang
    Zhang, Wenfeng
    Wang, Bin
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (09) : 14409 - 14427
  • [23] A weakly supervised spatial group attention network for fine-grained visual recognition
    Xie, Jiangjian
    Zhong, Yujie
    Zhang, Junguo
    Zhang, Changchun
    Schuller, Bjoern W.
    [J]. APPLIED INTELLIGENCE, 2023, 53 (20) : 23301 - 23315
  • [26] Attention-based supervised contrastive learning on fine-grained image classification
    Li, Qian
    Wu, Weining
    [J]. PATTERN ANALYSIS AND APPLICATIONS, 2024, 27 (03)
  • [27] Weakly Supervised Temporal Convolutional Networks for Fine-Grained Surgical Activity Recognition
    Ramesh, Sanat
    Dall'Alba, Diego
    Gonzalez, Cristians
    Yu, Tong
    Mascagni, Pietro
    Mutter, Didier
    Marescaux, Jacques
    Fiorini, Paolo
    Padoy, Nicolas
    [J]. IEEE TRANSACTIONS ON MEDICAL IMAGING, 2023, 42 (09) : 2592 - 2602
  • [28] Weakly Supervised Fine-Grained Visual Recognition via Adversarial Complementary Attentions and Hierarchical Bilinear Pooling
    Li, Xiaofei
    Liu, Jianming
    Wang, Mingwen
    [J]. NEURAL INFORMATION PROCESSING (ICONIP 2019), PT I, 2019, 11953 : 74 - 85
  • [29] Destruction and Construction Learning for Fine-grained Image Recognition
    Chen, Yue
    Bai, Yalong
    Zhang, Wei
    Mei, Tao
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 5152 - 5161
  • [30] Weakly supervised fine-grained image classification via two-level attention activation model
    Ke, Xiao
    Huang, Yanyan
    Guo, WenZhong
    [J]. COMPUTER VISION AND IMAGE UNDERSTANDING, 2022, 218