CMNet: a novel model and design rationale based on comparison studies and synergy of CNN and MetaFormer

Cited by: 1
Authors
Yu, Haowen [1]
Chen, Liming [2]
Affiliations
[1] Univ Manchester, Fac Biol Med & Hlth, Oxford Rd, Manchester M13 9PL, England
[2] Univ Ulster, Sch Comp, Cromore Rd, Belfast BT52 1SA, Northern Ireland
Keywords
Transformer; MetaFormer; Attention mechanism; Convolutional neural network
DOI
10.1007/s00138-023-01446-7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Convolutional- and Transformer-based backbone architectures are the two dominant, widely accepted model families in computer vision. Nevertheless, deciding which backbone architecture performs better, and under which circumstances, remains a challenge and therefore a focus of research. In this paper, we conduct an in-depth investigation into the differences in the macroscopic backbone design of CNN and Transformer models, with the ultimate purpose of developing new models that combine the strengths of both types of architecture for effective image classification. Specifically, we first analyze the structures of both models and identify four main differences; we then design four sets of ablation experiments on the ImageNet-1K dataset, using image classification as an example problem, to study the impact of these four differences on model performance. Based on the experimental results, we derive four observations as rules of thumb for designing a vision model backbone architecture. Informed by the experimental findings, we then conceive a novel model called CMNet, which marries the experimentally validated best design practices of CNN and Transformer architectures. Finally, we carry out extensive experiments on CMNet using the same dataset against baseline classifiers. Initial results show that CMNet achieves a highest top-1 accuracy of 80.08% on the ImageNet-1K validation set, a very competitive value compared with previous classical models of similar computational complexity. Details of the implementation, algorithms, and code are publicly available on GitHub: https://github.com/Arwin-Yu/CMNet.
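For readers unfamiliar with the metric reported in the abstract, the following is a minimal sketch of how top-1 accuracy is conventionally computed: a prediction counts as correct when the class with the highest score matches the ground-truth label. The function name `top1_accuracy` and the toy scores are assumptions made for illustration; they are not taken from the paper or from the CMNet repository.

```python
from typing import Sequence


def top1_accuracy(logits: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Fraction of samples whose highest-scoring class equals the true label.

    Hypothetical helper for illustration only; not the authors' evaluation code.
    """
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one sample is required")

    correct = 0
    for scores, label in zip(logits, labels):
        # Index of the largest score (first index wins on ties).
        predicted = max(range(len(scores)), key=scores.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Toy example with made-up scores for 4 samples and 3 classes.
toy_logits = [
    [0.1, 2.0, -1.0],  # predicted class 1
    [1.5, 0.2, 0.3],   # predicted class 0
    [0.0, 0.1, 0.9],   # predicted class 2
    [0.4, 0.3, 0.2],   # predicted class 0
]
toy_labels = [1, 0, 1, 0]
toy_acc = top1_accuracy(toy_logits, toy_labels)
print(f"top-1 accuracy: {toy_acc:.2%}")
```

On the toy data three of the four predictions match their labels. On the ImageNet-1K validation set the same ratio would be taken over all validation images and the 1,000 classes.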
Pages: 13