CMNet: a novel model and design rationale based on comparison studies and synergy of CNN and MetaFormer

Cited by: 1

Authors:

Yu, Haowen [1]

Chen, Liming [2]

Affiliations:

[1] Univ Manchester, Fac Biol Med & Hlth, Oxford Rd, Manchester M13 9PL, England

[2] Univ Ulster, Sch Comp, Cromore Rd, Belfast BT52 1SA, Northern Ireland

Source:

MACHINE VISION AND APPLICATIONS | 2023 / Vol. 34 / Issue 06

Keywords:

Transformer; MetaFormer; Attention mechanism; Convolutional neural network;

DOI:

10.1007/s00138-023-01446-7

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Convolutional and Transformer-based backbone architectures are the two dominant, widely accepted model families in computer vision. Nevertheless, deciding which backbone architecture performs better, and under which circumstances, remains a challenge and therefore a focus of research. In this paper, we conduct an in-depth investigation into the differences in the macroscopic backbone design of CNN and Transformer models, with the ultimate purpose of developing new models that combine the strengths of both types of architecture for effective image classification. Specifically, we first analyze the structures of both models and identify four main differences; we then design four sets of ablation experiments, using image classification on the ImageNet-1K dataset as an example, to study the impact of these four differences on model performance. Based on the experimental results, we derive four observations as rules of thumb for designing a vision model backbone architecture. Informed by these findings, we then conceive a novel model called CMNet, which marries the experimentally validated best design practices of CNN and Transformer architectures. Finally, we carry out extensive experiments on CMNet against baseline classifiers using the same dataset. Initial results show that CMNet achieves the highest top-1 accuracy of 80.08% on the ImageNet-1K validation set, a very competitive value compared with previous classical models of similar computational complexity. Details of the implementation, algorithms, and code are publicly available on GitHub: https://github.com/Arwin-Yu/CMNet.


Pages: 13
