Attention Guided Food Recognition via Multi-Stage Local Feature Fusion

Cited by: 1
Authors
Deng, Gonghui [1]
Wu, Dunzhi [1]
Chen, Weizhen [1]
Affiliations
[1] Wuhan Polytech Univ, Sch Elect & Elect Engn, Wuhan 430048, Peoples R China
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2024, Vol. 80, No. 02
Keywords
Fine-grained image recognition; food image recognition; attention mechanism; local feature fusion
DOI
10.32604/cmc.2024.052174
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Food image recognition, a subset of fine-grained image recognition, must cope with substantial intra-class variation and minimal inter-class differences. These challenges are compounded by the irregular and multi-scale nature of food images. To address them, our study introduces a model built on the ConvNeXt architecture that combines multiple attention mechanisms with multi-stage local fusion. The model employs hybrid attention (HA) mechanisms to locate the critical discriminative regions within images, substantially reducing the influence of background noise. It also introduces a multi-stage local fusion (MSLF) module that establishes long-distance dependencies between feature maps at different stages. This lets the model absorb complementary features across scales and significantly strengthens its capacity for feature extraction. In addition, we constructed a dataset named Roushi60, which consists of 60 categories of common meat dishes. Empirical evaluation on the ETH Food-101, ChineseFoodNet, and Roushi60 datasets shows that our model achieves recognition accuracies of 91.12%, 82.86%, and 92.50%, respectively. These figures mark improvements of 1.04%, 3.42%, and 1.36% over the foundational ConvNeXt network and also surpass the performance of most contemporary food image recognition methods. These results underscore the efficacy of the proposed model for food image recognition.
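The abstract reports final accuracies and gains but not the ConvNeXt baseline figures themselves. As a rough sketch, the baselines can be back-computed under the assumption (not stated in the record) that each gain is an absolute difference in percentage points rather than a relative improvement. The function name `implied_baseline` is purely illustrative.

```python
# Reported accuracy (%) and reported gain over ConvNeXt, taken from the abstract.
reported = {
    "ETH Food-101": (91.12, 1.04),
    "ChineseFoodNet": (82.86, 3.42),
    "Roushi60": (92.50, 1.36),
}


def implied_baseline(accuracy, gain):
    """Back-compute the baseline accuracy.

    Assumption: the gain is an absolute difference in percentage points.
    """
    return round(accuracy - gain, 2)


# Map each dataset name to its implied ConvNeXt baseline accuracy.
baselines = {
    name: implied_baseline(accuracy, gain)
    for name, (accuracy, gain) in reported.items()
}

for name, value in baselines.items():
    print(f"{name}: implied ConvNeXt accuracy {value:.2f}%")
```

If the gains were instead relative improvements, the implied baselines would differ, so these values should be checked against the results tables in the full paper.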
Pages: 1985-2003
Number of pages: 19