Fine grained food image recognition based on swin transformer

Cited by: 5
Authors
Xiao, Zhiyong [1 ,2 ]
Diao, Guang [1 ,2 ]
Deng, Zhaohong [1 ,2 ]
Affiliations
[1] Jiangnan Univ, Sch Artificial Intelligence & Comp Sci, Wuxi 214122, Peoples R China
[2] Jiangnan Univ, State Key Lab Food Sci & Resources, Wuxi 214122, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Fine-grained food image recognition; Deep learning; Swin transformer; Food health; Local feature enhancement;
DOI
10.1016/j.jfoodeng.2024.112134
Chinese Library Classification (CLC) number
TQ [Chemical Industry]
Discipline classification code
0817
Abstract
Fine-grained food image recognition is an important research direction in computer vision and machine learning. It remains highly challenging, however, when foods that belong to the same category or subcategory vary greatly in shape. To address this problem, this paper proposes a deep convolution module that produces a locally enhanced feature representation and combines it, through a deep residual connection, with the global feature representation obtained from Swin Transformer, yielding a deeper enhanced feature representation. The paper also proposes an end-to-end universal classifier for fine-grained food images, which extracts effective feature information from the enhanced feature representation more accurately and thereby achieves accurate recognition. The approach can accurately handle foods that differ widely in shape but belong to the same category, and is expected to help people better manage their diet and improve their health. The models were trained and validated on the public fine-grained food datasets FoodX-251 and UEC Food-256, reaching validation-set accuracies of 81.07% and 82.77%, respectively. Compared with other state-of-the-art self-supervised methods, the proposed method achieves higher accuracy on fine-grained food image recognition tasks.
Pages: 9