CMNet: a novel model and design rationale based on comparison studies and synergy of CNN and MetaFormer

Cited by: 1
Authors
Yu, Haowen [1]
Chen, Liming [2]
Affiliations
[1] Univ Manchester, Fac Biol Med & Hlth, Oxford Rd, Manchester M13 9PL, England
[2] Univ Ulster, Sch Comp, Cromore Rd, Belfast BT52 1SA, Northern Ireland
Keywords
Transformer; MetaFormer; Attention mechanism; Convolutional neural network
DOI
10.1007/s00138-023-01446-7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Convolutional and Transformer-based backbone architectures are the two dominant, widely accepted model families in computer vision. Nevertheless, deciding which backbone architecture performs better, and under which circumstances, remains a challenge and thus a focus of research. In this paper, we conduct an in-depth investigation into the differences in the macroscopic backbone design of CNN and Transformer models, with the ultimate purpose of developing new models that combine the strengths of both types of architecture for effective image classification. Specifically, we first analyze the structures of both models and identify four main differences; we then design four sets of ablation experiments on the ImageNet-1K dataset, using image classification as an example problem, to study the impact of these four differences on model performance. Based on the experimental results, we derive four observations as rules of thumb for designing a vision model backbone architecture. Informed by these findings, we then propose a novel model called CMNet, which combines the experimentally validated best design practices of CNN and Transformer architectures. Finally, we carry out extensive experiments on CMNet against baseline classifiers using the same dataset. Initial results show that CMNet reaches a best top-1 accuracy of 80.08% on the ImageNet-1K validation set, which is very competitive with previous classical models of similar computational complexity. Details of the implementation, including algorithms and code, are publicly available on GitHub: https://github.com/Arwin-Yu/CMNet.
Pages: 13
Related papers
50 in total
  • [1] CMNet: a novel model and design rationale based on comparison studies and synergy of CNN and MetaFormer
    Haowen Yu
    Liming Chen
    Machine Vision and Applications, 2023, 34
  • [2] FF-CMNET: A CNN-BASED MODEL FOR FINE-GRAINED CLASSIFICATION OF CAR MODELS BASED ON FEATURE FUSION
    Yu, Ye
    Jin, Qiang
    Chen, Chang Wen
    2018 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2018
  • [3] Belief Representation of Design Mental Model Based on Design Rationale
    Chen Ying
    Jing Shikai
    Wang Yedong
    Cheng Dada
    2018 5TH INTERNATIONAL CONFERENCE ON INDUSTRIAL ENGINEERING AND APPLICATIONS (ICIEA), 2018, : 510 - 514
  • [4] A Reconstruction Method of the Design Rationale Model Based on Design Context
    Liu, Jihong
    Zhan, Hongfei
    2013 10TH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (FSKD), 2013, : 492 - 497
  • [5] A novel CNN template design method based on GIM
    Zhao, JY
    Meng, HL
    Yu, DH
    ADVANCES IN NEURAL NETWORKS - ISNN 2005, PT 1, PROCEEDINGS, 2005, 3496 : 446 - 454
  • [6] Optimized Design of Instrument Recognition Based on CNN Model
    Jiao, Yanbing
    Lin, Xiaoguang
    Applied Mathematics and Nonlinear Sciences, 2024, 9 (01)
  • [7] An Attention-Based CNN-LSTM Model with Limb Synergy for Joint Angles Prediction
    Zhu, Chang
    Liu, Quan
    Meng, Wei
    Ai, Qingsong
    Xie, Sheng Q.
    2021 IEEE/ASME INTERNATIONAL CONFERENCE ON ADVANCED INTELLIGENT MECHATRONICS (AIM), 2021, : 747 - 752
  • [8] The rationale, design, and progress of two novel maintenance treatment studies in pediatric bipolarity
    Findling, RL
    Gracious, BL
    McNamara, NK
    Calabrese, JR
    ACTA NEUROPSYCHIATRICA, 2000, 12 (03): : 136 - 138
  • [9] A rationale-based architecture model for design traceability and reasoning
    Tang, Antony
    Jin, Yan
    Han, Jun
    JOURNAL OF SYSTEMS AND SOFTWARE, 2007, 80 (06) : 918 - 934
  • [10] CNN Model Design of Gesture Recognition Based on Tensorflow Framework
    Zeng, Zixian
    Gong, Qingge
    Zhang, Jun
    PROCEEDINGS OF 2019 IEEE 3RD INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC 2019), 2019, : 1062 - 1067