Balanced Mixture of SuperNets for Learning the CNN Pooling Architecture

Cited by: 0
Authors
Roshtkhari, Mehraveh Javan [1 ]
Toews, Matthew [1 ]
Pedersoli, Marco [1 ]
Affiliations
[1] Ecole Technol Super ETS, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
SEARCH; NETWORKS;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Downsampling layers, including pooling and strided convolutions, are crucial components of the convolutional neural network architecture that determine both the granularity/scale of image feature analysis and the receptive field size of a given layer. To understand this problem fully, we analyse the performance of models independently trained with each pooling configuration on CIFAR10, using a ResNet20 network, and show that the position of the downsampling layers can strongly influence the performance of a network and that predefined downsampling configurations are not optimal. Network Architecture Search (NAS) might be used to optimize downsampling configurations as a hyperparameter. However, we find that common one-shot NAS based on a single SuperNet does not work for this problem. We argue that this is because a SuperNet trained to find the optimal pooling configuration fully shares its parameters among all pooling configurations, which makes training hard: learning some configurations can harm the performance of others. Therefore, we propose a balanced mixture of SuperNets that automatically associates pooling configurations with different weight models, reducing the weight sharing and mutual influence of pooling configurations on the SuperNet parameters. We evaluate our proposed approach on CIFAR10, CIFAR100, and Food101, and show that in all cases our model outperforms other approaches and improves over the default pooling configurations.
Pages: 23
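
A minimal, hypothetical sketch of the idea described in the abstract: several SuperNets (weight models) are maintained, and each candidate pooling configuration is associated with one of them, so that configurations do not all share, and interfere through, a single set of weights. The paper learns this association automatically; the sketch below simplifies it to a fixed, balanced round-robin assignment, and all names here (TinyConvNet, assign_balanced, the toy training loop) are illustrative assumptions, not the authors' code.

    # Sketch only: balanced assignment of pooling configurations to multiple SuperNets.
    import itertools
    import random
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TinyConvNet(nn.Module):
        """A small CNN whose downsampling positions are given by a pooling configuration."""
        def __init__(self, num_blocks=4, channels=16, num_classes=10):
            super().__init__()
            self.stem = nn.Conv2d(3, channels, 3, padding=1)
            self.blocks = nn.ModuleList(
                [nn.Conv2d(channels, channels, 3, padding=1) for _ in range(num_blocks)]
            )
            self.head = nn.Linear(channels, num_classes)

        def forward(self, x, pooling_config):
            # pooling_config: tuple of 0/1 flags, one per block; 1 = downsample after that block
            x = F.relu(self.stem(x))
            for block, do_pool in zip(self.blocks, pooling_config):
                x = F.relu(block(x))
                if do_pool:
                    x = F.max_pool2d(x, 2)
            x = x.mean(dim=(2, 3))  # global average pooling
            return self.head(x)

    def assign_balanced(configs, num_supernets):
        """Round-robin assignment: each configuration is owned by exactly one SuperNet,
        and every SuperNet owns roughly the same number of configurations."""
        return {cfg: i % num_supernets for i, cfg in enumerate(configs)}

    # Candidate configurations: place exactly 2 downsampling layers among 4 blocks.
    configs = [c for c in itertools.product([0, 1], repeat=4) if sum(c) == 2]

    num_supernets = 3
    supernets = [TinyConvNet() for _ in range(num_supernets)]
    owner = assign_balanced(configs, num_supernets)
    optimizers = [torch.optim.SGD(net.parameters(), lr=0.05) for net in supernets]

    # Toy training steps: only the SuperNet that owns the sampled configuration is
    # updated, which limits interference between configurations.
    x = torch.randn(8, 3, 32, 32)
    y = torch.randint(0, 10, (8,))
    for _ in range(5):
        cfg = random.choice(configs)
        k = owner[cfg]
        optimizers[k].zero_grad()
        loss = F.cross_entropy(supernets[k](x, cfg), y)
        loss.backward()
        optimizers[k].step()
        print(f"config={cfg} supernet={k} loss={loss.item():.3f}")

In an actual search, the best pooling configuration would then be chosen by validating each configuration with the SuperNet that owns it; the loop above only illustrates that each sampled configuration updates a single weight model rather than one fully shared SuperNet.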