Dynamic Convolutional Neural Networks as Efficient Pre-Trained Audio Models

Authors
Schmid, Florian [1]
Koutini, Khaled [1, 2]
Widmer, Gerhard [1, 2]
Affiliations
[1] Johannes Kepler Univ Linz, Inst Computat Percept CP JKU, A-4040 Linz, Austria
[2] Johannes Kepler Univ Linz, LIT Artificial Intelligence Lab, A-4040 Linz, Austria
Funding
European Research Council
Keywords
Dynamic convolutional neural networks; dynamic convolution; dynamic ReLU; coordinate attention; audio spectrogram transformer; audio classification; pre-trained audio models; knowledge distillation;
DOI
10.1109/TASLP.2024.3376984
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
The introduction of large-scale audio datasets, such as AudioSet, paved the way for Transformers to conquer the audio domain and replace CNNs as the state-of-the-art neural network architecture for many tasks. Audio Spectrogram Transformers are excellent at exploiting large datasets, creating powerful pre-trained models that surpass CNNs when fine-tuned on downstream tasks. However, current popular Audio Spectrogram Transformers are demanding in terms of computational complexity compared to CNNs. Recently, we have shown that, by employing Transformer-to-CNN Knowledge Distillation, efficient CNNs can catch up with and even outperform Transformers on large datasets. In this work, we extend this line of research and increase the capacity of efficient CNNs by introducing dynamic CNN blocks constructed of dynamic convolutions, a dynamic ReLU activation function, and Coordinate Attention. We show that these dynamic CNNs outperform traditional efficient CNNs, such as MobileNets, in terms of the performance-complexity trade-off at the task of audio tagging on the large-scale AudioSet. Our experiments further indicate that the proposed dynamic CNNs achieve competitive performance with Transformer-based models for end-to-end fine-tuning on downstream tasks while being much more computationally efficient.
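The abstract names dynamic convolution as one building block but does not spell out the mechanism. As a rough illustration only, the following sketch shows the general idea usually meant by that term: several candidate kernels are mixed with input-dependent attention weights before the convolution is applied. It is simplified to a 1x1 (pointwise) kernel in plain Python, and the function names, the linear attention layer, and the `temperature` parameter are assumptions made for this example, not details taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [v / temperature for v in logits]
    peak = max(scaled)
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_pointwise_conv(x, kernels, attn_weights, temperature=1.0):
    """Illustrative dynamic 1x1 convolution (hypothetical helper).

    x:            input feature map, nested lists shaped [C_in][F][T]
    kernels:      K candidate 1x1 kernels, shaped [K][C_out][C_in]
    attn_weights: linear attention layer, shaped [K][C_in] (an assumption)
    Returns the output map [C_out][F][T] and the K attention weights.
    """
    c_in = len(x)
    n_freq, n_time = len(x[0]), len(x[0][0])
    n_kernels, c_out = len(kernels), len(kernels[0])

    # 1) Global average pooling: one context value per input channel.
    context = [sum(sum(row) for row in ch) / (n_freq * n_time) for ch in x]

    # 2) Input-dependent attention over the K candidate kernels.
    logits = [sum(w * c for w, c in zip(row, context)) for row in attn_weights]
    pi = softmax(logits, temperature)

    # 3) Aggregate the candidates into a single kernel for this input.
    agg = [
        [sum(pi[k] * kernels[k][o][i] for k in range(n_kernels))
         for i in range(c_in)]
        for o in range(c_out)
    ]

    # 4) Apply the aggregated 1x1 kernel at every frequency-time position.
    y = [
        [[sum(agg[o][i] * x[i][f][t] for i in range(c_in))
          for t in range(n_time)]
         for f in range(n_freq)]
        for o in range(c_out)
    ]
    return y, pi
```

Because the kernels are mixed before the convolution runs, the cost of the convolution itself stays that of a single kernel; only the small attention computation is added per input.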
Pages: 2227-2241
Page count: 15