Regularization of deep neural network using a multisample memory model

Cited: 0
Authors
Muhammad Tanveer [1 ]
Mohammad Yakoob Siyal [1 ]
Sheikh Faisal Rashid [2 ]
Affiliations
[1] Nanyang Technological University,School of Electrical and Electronics Engineering
[2] Berlin Educational Technology Lab (EdTec),German Research Center for Artificial Intelligence (DFKI)
Keywords
Deeper architecture; Overfitting; Regularization; Bag sampling; Memory model; Superfast convergence;
DOI
10.1007/s00521-024-10474-x
Abstract
Deep convolutional neural networks (CNNs) are widely used in computer vision and have achieved significant performance on image classification tasks. Overfitting is a general problem in deep learning models that inhibits their generalization capability, arising from the presence of noise, the limited size of the training data, the complexity of the classifier, and the large number of hyperparameters involved during training. Several techniques have been developed to mitigate overfitting, but in this research we focus only on regularization techniques. We propose a memory-based regularization technique to mitigate overfitting and improve the generalization of deep neural networks. Our backbone architectures receive input samples in bags rather than directly in batches to generate deep features. The proposed model receives input samples as queries and feeds them to the memory access module (MAM), which searches for the relevant items in memory and computes a memory loss using Euclidean similarity measures. Our memory loss function incorporates intra-class compactness and inter-class separability at the feature level. Most surprisingly, the convergence rate of the proposed model is superfast, requiring only a few epochs to train both shallow and deeper models. In this study, we evaluate the performance of the memory model across several state-of-the-art (SOTA) deep learning architectures, including ResNet18, ResNet50, ResNet101, VGG-16, AlexNet, and MobileNet, using the CIFAR-10 and CIFAR-100 datasets. The results show that the proposed memory model outperforms almost all existing SOTA benchmarks by a considerable margin.
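The abstract describes a memory loss that combines intra-class compactness with inter-class separability, computed from Euclidean distances between a query feature and items in a memory bank. A minimal sketch of such a loss is given below; the function name, the margin-based formulation of the separability term, and the way the two terms are combined are illustrative assumptions, not the paper's exact definition.

```python
import numpy as np

def memory_loss(query, memory, query_label, memory_labels, margin=1.0):
    """Sketch of a memory-based regularization loss (assumed form).

    query         : (d,) feature vector of the input sample.
    memory        : (m, d) array of stored memory items.
    query_label   : class label of the query.
    memory_labels : (m,) class labels of the memory items.
    margin        : assumed separation margin for other-class items.
    """
    # Euclidean distance from the query to every memory item.
    dists = np.linalg.norm(memory - query, axis=1)
    same = memory_labels == query_label

    # Intra-class compactness: pull the query toward same-class items.
    intra = dists[same].mean() if same.any() else 0.0
    # Inter-class separability: penalize other-class items closer than the margin.
    inter = np.maximum(0.0, margin - dists[~same]).mean() if (~same).any() else 0.0

    return intra + inter
```

In a training loop, this scalar would be added to the classification loss so that features of the same class cluster around their memory items while other classes stay at least a margin away.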
Pages: 23295-23307
Page count: 12
Related Papers
50 items total
  • [31] Neural network model of spatial memory
    Fukushima, K
    Yamaguchi, Y
    1997 IEEE INTERNATIONAL CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, 1997, : 548 - 553
  • [32] MULTI-LEVEL DEEP NEURAL NETWORK ADAPTATION FOR SPEAKER VERIFICATION USING MMD AND CONSISTENCY REGULARIZATION
    Lin, Weiwei
    Mak, Man-Wai
    Li, Na
    Su, Dan
    Yu, Dong
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6839 - 6843
  • [33] A Spiking Neural Network Model for Associative Memory Using Temporal Codes
    Hu, Jun
    Tang, Huajin
    Tan, Kay Chen
    Gee, Sen Bong
    PROCEEDINGS OF THE 18TH ASIA PACIFIC SYMPOSIUM ON INTELLIGENT AND EVOLUTIONARY SYSTEMS, VOL 1, 2015, : 561 - 572
  • [34] The Analysis of Regularization in Deep Neural Networks Using Metagraph Approach
    Fedorenko, Yuriy S.
    Gapanyuk, Yuriy E.
    Minakova, Svetlana V.
    ADVANCES IN NEURAL COMPUTATION, MACHINE LEARNING, AND COGNITIVE RESEARCH, 2018, 736 : 3 - 8
  • [35] Deep memory and prediction neural network for video prediction
    Liu, Zhipeng
    Chai, Xiujuan
    Chen, Xilin
    NEUROCOMPUTING, 2019, 331 : 235 - 241
  • [36] Hierarchical Approximate Memory for Deep Neural Network Applications
    Ha, Minho
    Hwang, Seokha
    Kim, Jeonghun
    Lee, Youngjoo
    Lee, Sunggu
    2020 54TH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS, AND COMPUTERS, 2020, : 261 - 266
  • [37] MEMORY REDUCTION METHOD FOR DEEP NEURAL NETWORK TRAINING
    Shirahata, Koichi
    Tomita, Yasumoto
    Ike, Atsushi
    2016 IEEE 26TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2016,
  • [38] A Survey on Memory Subsystems for Deep Neural Network Accelerators
    Asad, Arghavan
    Kaur, Rupinder
    Mohammadi, Farah
    FUTURE INTERNET, 2022, 14 (05):
  • [39] Speech Recognition Model for Assamese Language Using Deep Neural Network
    Singh, Moirangthem Tiken
    Barman, Partha Pratim
    Gogoi, Rupjyoti
    2018 INTERNATIONAL CONFERENCE ON RECENT INNOVATIONS IN ELECTRICAL, ELECTRONICS & COMMUNICATION ENGINEERING (ICRIEECE 2018), 2018, : 2722 - 2727
  • [40] Devanagri character recognition model using deep convolution neural network
    Ram, Shrawan
    Gupta, Shloak
    Agarwal, Basant
    JOURNAL OF STATISTICS & MANAGEMENT SYSTEMS, 2018, 21 (04): : 593 - 599