Attention Guided Food Recognition via Multi-Stage Local Feature Fusion

Cited by: 1
Authors
Deng, Gonghui [1]
Wu, Dunzhi [1]
Chen, Weizhen [1]
Affiliations
[1] Wuhan Polytech Univ, Sch Elect & Elect Engn, Wuhan 430048, Peoples R China
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2024 / Vol. 80 / No. 02
Keywords
Fine-grained image recognition; food image recognition; attention mechanism; local feature fusion;
DOI
10.32604/cmc.2024.052174
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
The task of food image recognition, a nuanced subset of fine-grained image recognition, grapples with substantial intra-class variation and minimal inter-class differences. These challenges are compounded by the irregular and multi-scale nature of food images. Addressing these complexities, our study introduces an advanced model that leverages multiple attention mechanisms and multi-stage local fusion, grounded in the ConvNeXt architecture. Our model employs hybrid attention (HA) mechanisms to pinpoint critical discriminative regions within images, substantially mitigating the influence of background noise. Furthermore, it introduces a multi-stage local fusion (MSLF) module, fostering long-distance dependencies between feature maps at varying stages. This approach facilitates the assimilation of complementary features across scales, significantly bolstering the model's capacity for feature extraction. In addition, we constructed a dataset named Roushi60, which consists of 60 different categories of common meat dishes. Empirical evaluation on the ETH Food-101, ChineseFoodNet, and Roushi60 datasets reveals that our model achieves recognition accuracies of 91.12%, 82.86%, and 92.50%, respectively. These figures not only mark an improvement of 1.04%, 3.42%, and 1.36% over the foundational ConvNeXt network but also surpass the performance of most contemporary food image recognition methods. Such advancements underscore the efficacy of our proposed model in navigating the intricate landscape of food image recognition, setting a new benchmark for the field.
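The abstract reports final accuracies and gains but not the ConvNeXt baseline figures themselves. As a rough check, the sketch below back-computes the implied baselines. It assumes the stated gains are absolute percentage points rather than relative improvements, which the abstract does not say explicitly. The names `reported` and `implied_baselines` are illustrative only and do not come from the paper.

```python
from decimal import Decimal

# Accuracy (%) and gain over ConvNeXt, copied from the abstract.
# Decimal avoids binary floating-point rounding noise in the subtraction.
reported = {
    "ETH Food-101": (Decimal("91.12"), Decimal("1.04")),
    "ChineseFoodNet": (Decimal("82.86"), Decimal("3.42")),
    "Roushi60": (Decimal("92.50"), Decimal("1.36")),
}


def implied_baselines(results):
    """Return accuracy minus gain per dataset.

    Assumption (not confirmed by the source): each gain is an absolute
    difference in percentage points, so baseline = accuracy - gain.
    """
    return {name: acc - gain for name, (acc, gain) in results.items()}


baselines = implied_baselines(reported)
for name, value in baselines.items():
    print(f"{name}: implied ConvNeXt baseline = {value}%")
```

Under that assumption the implied baselines are 90.08%, 79.44%, and 91.14%. If the gains were instead relative, the baselines would differ, so the full paper's tables remain the authoritative source.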
Pages: 1985-2003
Page count: 19
Related papers
50 in total
  • [41] Multi-stage fusion for face localization
    Belaroussi, R
    Prevost, L
    Milgram, M
    2005 7TH INTERNATIONAL CONFERENCE ON INFORMATION FUSION (FUSION), VOLS 1 AND 2, 2005, : 1218 - 1225
  • [42] SUPERRESOLUTION AND SEGMENTATION OF OCT SCANS USING MULTI-STAGE ADVERSARIAL GUIDED ATTENTION TRAINING
    Jeihouni, Paria
    Dehzangi, Omid
    Amireskandari, Annahita
    Dabouei, Ali
    Rezai, Ali
    Nasrabadi, Nasser M.
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 1106 - 1110
  • [43] Relation-Guided Multi-stage Feature Aggregation Network for Video Object Detection
    Yao, Tingting
    Cao, Fuxiao
    Mi, Fuheng
    Li, Danmeng
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VI, 2024, 14430 : 146 - 157
  • [44] Multi-Stage Multi-Task Feature Learning
    Gong, Pinghua
    Ye, Jieping
    Zhang, Changshui
    JOURNAL OF MACHINE LEARNING RESEARCH, 2013, 14 : 2979 - 3010
  • [45] Multi-stage Multi-modalities Fusion of Lip, Tongue and Acoustics Information for Speech Recognition
    Wang, Xuening
    Qian, Zhaopeng
    Yu, Chongchong
    PROCEEDINGS OF 2023 6TH ARTIFICIAL INTELLIGENCE AND CLOUD COMPUTING CONFERENCE, AICCC 2023, 2023, : 226 - 231
  • [46] Multi-stage multi-task feature learning
    Gong, Pinghua
    Ye, Jieping
    Zhang, Changshui
    Journal of Machine Learning Research, 2013, 14 : 2979 - 3010
  • [47] Enhanced rolling bearing fault diagnosis using a multi-stage attention fusion network
    Ma, Mingyuan
    Qu, Chenxi
    Zhao, Xudong
    Li, Fenglei
    Qu, Shengguan
    PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART C-JOURNAL OF MECHANICAL ENGINEERING SCIENCE, 2025,
  • [48] Feature Fusion for Human Activity Recognition using Parameter-Optimized Multi-Stage Graph Convolutional Network and Transformer Models
    Belal, Mohammad
    Hassan, Taimur
    Ahmed, Abdelfatah
    Aljarah, Ahmad
    Alsheikh, Nael
    Hussain, Irfan
    2024 IEEE INTERNATIONAL CONFERENCE ON ADVANCED VIDEO AND SIGNAL BASED SURVEILLANCE, AVSS 2024, 2024,
  • [49] Gait recognition via weighted global-local feature fusion and attention-based multiscale temporal aggregation
    Xu, Yingqi
    Xi, Hao
    Ren, Kai
    Zhu, Qiyuan
    Hu, Chuanping
    JOURNAL OF ELECTRONIC IMAGING, 2025, 34 (01)
  • [50] Multi-scale feature fusion network with local attention for lung segmentation
    Xie, Yinghua
    Zhou, Yuntong
    Wang, Chen
    Ma, Yanshan
    Yang, Ming
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2023, 119