Fine-grained Image Classification by Visual-Semantic Embedding

Times cited: 0
Authors
Xu, Huapeng [1]
Qi, Guilin [1]
Li, Jingjing [2]
Wang, Meng [3]
Xu, Kang [4]
Gao, Huan [1]
Affiliations
[1] Southeast Univ, Nanjing, Peoples R China
[2] Univ Elect Sci & Technol China, Chengdu, Peoples R China
[3] Xi An Jiao Tong Univ, Xian, Peoples R China
[4] Nanjing Univ Posts & Telecommun, Nanjing, Peoples R China
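Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation; National Key Research and Development Program of China;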
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper investigates fine-grained image classification (FGIC), a challenging problem. Unlike conventional computer vision problems, FGIC suffers from large intra-class diversity and subtle inter-class differences. Existing FGIC approaches are limited to exploring only the visual information embedded in the images. In this paper, we present a novel approach that uses readily available prior knowledge from either structured knowledge bases or unstructured text to facilitate FGIC. Specifically, we propose a visual-semantic embedding model that derives semantic embeddings from knowledge bases and text, and then trains a novel end-to-end CNN framework to linearly map image features to a rich semantic embedding space. Experimental results on the challenging large-scale Caltech-UCSD Birds-200-2011 dataset show that our approach outperforms several state-of-the-art methods by a significant margin.
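The core idea in the abstract, a linear map from image features into a semantic embedding space followed by classification against class embeddings, can be illustrated with a minimal sketch. This is not the authors' method: the paper trains an end-to-end CNN, whereas the sketch below assumes fixed, precomputed image features and swaps in closed-form ridge regression to fit the map. The function names, the `ridge` parameter, and the cosine-similarity decision rule are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def fit_linear_map(image_features, target_embeddings, ridge=1e-3):
    """Fit W so that image_features @ W approximates target_embeddings.

    Assumption: closed-form ridge regression on fixed features stands in
    for the end-to-end CNN training described in the abstract.
    """
    dim = image_features.shape[1]
    gram = image_features.T @ image_features + ridge * np.eye(dim)
    return np.linalg.solve(gram, image_features.T @ target_embeddings)


def predict(image_features, W, class_embeddings):
    """Project features into the semantic space and pick the closest class
    embedding by cosine similarity (a hypothetical decision rule)."""
    projected = l2_normalize(image_features @ W)
    scores = projected @ l2_normalize(class_embeddings).T
    return scores.argmax(axis=1)
```

To use it, one would pass a feature matrix of shape (number of images, feature dimension), the semantic embedding of each training image's class as the regression target, and the matrix of all class embeddings at prediction time. In practice the class embeddings would come from a knowledge base or text corpus, as the abstract describes.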
Pages: 1043-1049
Number of pages: 7
Related papers
50 in total
  • [21] Generating Knowledge-Enriched Image Annotations for Fine-Grained Visual Classification
    Murabito, Francesca
    Palazzo, Simone
    Spampinato, Concetto
    Giordano, Daniela
    [J]. IMAGE ANALYSIS AND PROCESSING (ICIAP 2017), PT I, 2017, 10484 : 332 - 344
  • [22] Ladder Loss for Coherent Visual-Semantic Embedding
    Zhou, Mo
    Niu, Zhenxing
    Wang, Le
    Gao, Zhanning
    Zhang, Qilin
    Hua, Gang
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 13050 - 13057
  • [23] MASK-VIT: AN OBJECT MASK EMBEDDING IN VISION TRANSFORMER FOR FINE-GRAINED VISUAL CLASSIFICATION
    Su, Tong
    Ye, Shuo
    Song, Chengqun
    Cheng, Jun
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 1626 - 1630
  • [24] Fine-Grained Representation Learning and Recognition by Exploiting Hierarchical Semantic Embedding
    Chen, Tianshui
    Wu, Wenxi
    Gao, Yuefang
    Dong, Le
    Luo, Xiaonan
    Lin, Liang
    [J]. PROCEEDINGS OF THE 2018 ACM MULTIMEDIA CONFERENCE (MM'18), 2018, : 2023 - 2031
  • [25] Survey of Visual-Semantic Embedding Methods for Zero-Shot Image Retrieval
    Ueki, Kazuya
    [J]. 20TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2021), 2021, : 628 - 634
  • [26] Exploration of Class Center for Fine-Grained Visual Classification
    Yao, Hang
    Miao, Qiguang
    Zhao, Peipei
    Li, Chaoneng
    Li, Xin
    Feng, Guanwen
    Liu, Ruyi
    [J]. IEEE Transactions on Circuits and Systems for Video Technology, 2024, 34 (10) : 9954 - 9966
  • [27] Adaptive Destruction Learning for Fine-grained Visual Classification
    Zhang, Riheng
    Tan, Min
    Mao, Xiaoyang
    Gao, Zhigang
    Gu, Xiaoling
    [J]. 2022 IEEE INTL CONF ON DEPENDABLE, AUTONOMIC AND SECURE COMPUTING, INTL CONF ON PERVASIVE INTELLIGENCE AND COMPUTING, INTL CONF ON CLOUD AND BIG DATA COMPUTING, INTL CONF ON CYBER SCIENCE AND TECHNOLOGY CONGRESS (DASC/PICOM/CBDCOM/CYBERSCITECH), 2022, : 946 - 950
  • [28] A sparse focus framework for visual fine-grained classification
    Wang, YongXiong
    Li, Guangjun
    Ma, Li
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (16) : 25271 - 25289
  • [29] Click-through-based Deep Visual-Semantic Embedding for Image Search
    Liu, Yuan
    Shi, Zhongchao
    Li, Xue
    Wang, Gang
    [J]. MM'15: PROCEEDINGS OF THE 2015 ACM MULTIMEDIA CONFERENCE, 2015, : 955 - 958