CMNet: a novel model and design rationale based on comparison studies and synergy of CNN and MetaFormer

Cited by: 1
Authors
Yu, Haowen [1]
Chen, Liming [2]
Affiliations
[1] Univ Manchester, Fac Biol Med & Hlth, Oxford Rd, Manchester M13 9PL, England
[2] Univ Ulster, Sch Comp, Cromore Rd, Belfast BT52 1SA, Northern Ireland
Keywords
Transformer; MetaFormer; Attention mechanism; Convolutional neural network;
DOI
10.1007/s00138-023-01446-7
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Convolutional and Transformer-based backbone architectures are the two dominant, widely accepted model families in computer vision. Nevertheless, deciding which backbone architecture performs better, and under which circumstances, remains a challenge and therefore a focus of research. In this paper, we conduct an in-depth investigation into the differences between the macroscopic backbone designs of CNN and Transformer models, with the ultimate purpose of developing new models that combine the strengths of both types of architecture for effective image classification. Specifically, we first analyze the structures of both models and identify four main differences. We then design four sets of ablation experiments on the ImageNet-1K dataset, using image classification as an example problem, to study the impact of these four differences on model performance. Based on the experimental results, we derive four observations as rules of thumb for designing a vision model backbone architecture. Informed by the experimental findings, we then conceive a novel model called CMNet, which marries the experimentally validated best design practices of CNN and Transformer architectures. Finally, we carry out extensive experiments on CMNet using the same dataset against baseline classifiers. Initial results show that CMNet achieves a highest top-1 accuracy of 80.08% on the ImageNet-1K validation set, which is very competitive with previous classical models of similar computational complexity. Details of the implementation, algorithms, and code are publicly available on GitHub: https://github.com/Arwin-Yu/CMNet.
Pages: 13