Fine grained food image recognition based on swin transformer

Cited by: 5
Authors
Xiao, Zhiyong [1, 2]
Diao, Guang [1, 2]
Deng, Zhaohong [1, 2]
Affiliations
[1] Jiangnan Univ, Sch Artificial Intelligence & Comp Sci, Wuxi 214122, Peoples R China
[2] Jiangnan Univ, State Key Lab Food Sci & Resources, Wuxi 214122, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Fine-grained food image recognition; Deep learning; Swin transformer; Food health; Local feature enhancement;
DOI
10.1016/j.jfoodeng.2024.112134
Chinese Library Classification number
TQ [Chemical Industry]
Discipline classification code
0817
Abstract
Fine-grained food image recognition is an important research direction in computer vision and machine learning. It remains challenging, however, when foods that belong to the same category or subcategory differ greatly in shape. To address this problem, this paper proposes a deep convolution module that produces a locally enhanced feature representation and combines it, through a deep residual connection, with the global feature representation obtained from Swin Transformer, yielding a deeper enhanced feature representation. An end-to-end, general-purpose fine-grained food classifier is also proposed, which extracts effective feature information from the enhanced representation more accurately and achieves accurate recognition. The approach handles foods that differ widely in shape but belong to the same category and is expected to help people better manage their diet and improve their health. The models were trained and validated on the public fine-grained food datasets FoodX-251 and UEC Food-256, reaching validation accuracies of 81.07% and 82.77%, respectively. Compared with other state-of-the-art self-supervised methods, the proposed method achieves higher accuracy on fine-grained food image recognition tasks.
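The abstract describes the architecture only at a high level, so the following is a minimal sketch of the general idea (local enhancement by convolution, a residual sum with the global features, pooling, then a classifier), not the authors' implementation. Everything concrete here is an assumption for illustration: the global token features are given as a plain list rather than computed by Swin Transformer, the local module is reduced to a 1-D depthwise convolution along the token axis, and the names `depthwise_conv1d`, `residual_fuse`, `mean_pool`, and `classify`, as well as the kernel and weights, are hypothetical.

```python
# Illustrative sketch only. Shapes, kernel values, and weights are hypothetical
# and are NOT taken from the paper.

def depthwise_conv1d(tokens, kernel):
    """Slide one shared 1-D kernel along the token axis, channel by channel.

    tokens: list of n feature vectors, each of length c.
    Positions outside the sequence are treated as zero (zero padding).
    """
    n, c = len(tokens), len(tokens[0])
    half = len(kernel) // 2
    out = [[0.0] * c for _ in range(n)]
    for i in range(n):
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < n:
                for ch in range(c):
                    out[i][ch] += w * tokens[j][ch]
    return out


def residual_fuse(global_tokens, kernel):
    """Add locally enhanced features back onto the global features."""
    local = depthwise_conv1d(global_tokens, kernel)
    return [
        [g + l for g, l in zip(g_row, l_row)]
        for g_row, l_row in zip(global_tokens, local)
    ]


def mean_pool(tokens):
    """Average over the token axis to get one feature vector."""
    n = len(tokens)
    return [sum(column) / n for column in zip(*tokens)]


def classify(features, weights, biases):
    """Linear classifier: return (index of the largest logit, logits)."""
    logits = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    best = max(range(len(logits)), key=logits.__getitem__)
    return best, logits


# Stand-in for global features from a backbone: 3 tokens, 2 channels.
global_tokens = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
kernel = [0.25, 0.5, 0.25]          # hypothetical smoothing kernel

fused = residual_fuse(global_tokens, kernel)
pooled = mean_pool(fused)
pred, logits = classify(pooled, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
print(fused, pooled, pred)
```

In a real model the stand-in token list would come from a Swin Transformer backbone and the convolution and classifier weights would be learned; the sketch only shows how a residual sum lets the local branch refine the global features without replacing them.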
Pages: 9