Attention Guided Food Recognition via Multi-Stage Local Feature Fusion

Cited by: 1
Authors
Deng, Gonghui [1]
Wu, Dunzhi [1]
Chen, Weizhen [1]
Affiliations
[1] Wuhan Polytech Univ, Sch Elect & Elect Engn, Wuhan 430048, Peoples R China
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2024, Vol. 80, No. 02
Keywords
Fine-grained image recognition; food image recognition; attention mechanism; local feature fusion
DOI
10.32604/cmc.2024.052174
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Food image recognition, a subset of fine-grained image recognition, must cope with substantial intra-class variation and minimal inter-class differences. These challenges are compounded by the irregular and multi-scale nature of food images. To address them, our study introduces a model built on the ConvNeXt architecture that combines multiple attention mechanisms with multi-stage local fusion. The model employs hybrid attention (HA) mechanisms to locate critical discriminative regions within images, substantially reducing the influence of background noise. It also introduces a multi-stage local fusion (MSLF) module that establishes long-distance dependencies between feature maps at different stages. This allows complementary features to be combined across scales and strengthens the model's capacity for feature extraction. In addition, we constructed a dataset named Roushi60, which consists of 60 categories of common meat dishes. Empirical evaluation on the ETH Food-101, ChineseFoodNet, and Roushi60 datasets shows that our model achieves recognition accuracies of 91.12%, 82.86%, and 92.50%, respectively. These figures mark improvements of 1.04%, 3.42%, and 1.36% over the baseline ConvNeXt network and surpass the performance of most contemporary food image recognition methods. These results support the effectiveness of the proposed model for food image recognition.
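As a quick check on the reported figures, the sketch below derives the ConvNeXt baseline accuracies implied by the abstract. It assumes the stated improvements are absolute differences in percentage points, which the abstract does not say explicitly; the function name `implied_baseline` is hypothetical and chosen only for illustration.

```python
# Accuracy (%) and improvement over ConvNeXt (%) as reported in the abstract.
reported = {
    "ETH Food-101": (91.12, 1.04),
    "ChineseFoodNet": (82.86, 3.42),
    "Roushi60": (92.50, 1.36),
}


def implied_baseline(accuracy, gain):
    """Return the baseline accuracy implied by a reported accuracy and gain.

    Assumption: the gain is an absolute difference in percentage points,
    not a relative improvement.
    """
    return round(accuracy - gain, 2)


baselines = {
    name: implied_baseline(accuracy, gain)
    for name, (accuracy, gain) in reported.items()
}

for name, baseline in baselines.items():
    print(f"{name}: implied ConvNeXt baseline {baseline:.2f}%")
```

Under that assumption, the largest gain is on ChineseFoodNet, where the implied baseline is also the lowest of the three datasets.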
Pages: 1985-2003
Number of pages: 19