The impact of MFCC, spectrogram, and Mel-Spectrogram on deep learning models for Amazigh speech recognition system

Cited by: 0
Authors
Meryam Telmem [1]
Naouar Laaidi [2]
Hassan Satori [2]
Affiliations
[1] Université Moulay Ismail de Meknes
[2] Sidi Mohamed Ben Abdellah University
Keywords
MFCC; Spectrogram; Mel-Spectrogram; CNN; LSTM; bi-LSTM; Amazigh language
DOI
10.1007/s10772-025-10183-3
Abstract
Feature extraction is an essential phase in the development of Automatic Speech Recognition (ASR) systems. This study examines the performance of different deep neural network architectures, including Convolutional Neural Networks (CNNs), Long Short-Term Memory networks (LSTMs), and bidirectional LSTM (bi-LSTM) models, for an Amazigh speech recognition system. It evaluates how several feature extraction techniques, specifically Mel-Frequency Cepstral Coefficients (MFCC), spectrograms, and Mel-spectrograms, affect the performance of these architectures. The results show that the bi-LSTM with spectrograms achieved a maximum accuracy of 85%, the best performance in our Amazigh speech recognition study. We also show that each feature type offers specific advantages, influenced by the particular neural network architecture employed.
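To make the three feature types named in the abstract concrete, the following is a minimal pure-Python sketch of how a spectrogram, a Mel-spectrogram, and MFCCs relate to one another. It is not the authors' pipeline: the function names, the frame length (`n_fft=256`), hop size, number of mel filters, number of cepstral coefficients, and the commonly used `2595 * log10(1 + f / 700)` mel formula are all assumptions chosen for illustration.

```python
import cmath
import math


def hz_to_mel(f):
    # A common mel-scale formula (assumed here; other variants exist).
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrogram(signal, n_fft=256, hop=128):
    """Hann-windowed frames -> power spectrum per frame (naive DFT)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / n_fft) for n in range(n_fft)]
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(n_fft)]
        spectrum = []
        for k in range(n_fft // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / n_fft)
                      for n in range(n_fft))
            spectrum.append(abs(acc) ** 2)
        frames.append(spectrum)
    return frames


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    hz_points = [mel_to_hz(i * top / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_freqs = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for i in range(1, n_mels + 1):
        lo, mid, hi = hz_points[i - 1], hz_points[i], hz_points[i + 1]
        row = []
        for f in bin_freqs:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising edge
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling edge
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def dct2(values, n_out):
    """Unnormalised DCT-II, keeping the first n_out coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_out)]


def extract_features(signal, sample_rate, n_fft=256, hop=128, n_mels=20, n_mfcc=13):
    """Return (spectrogram, mel_spectrogram, mfcc), one row per frame."""
    spec = power_spectrogram(signal, n_fft, hop)
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    mel_spec = [[sum(w * p for w, p in zip(row, frame)) for row in bank]
                for frame in spec]
    # Small constant avoids log(0) for empty bands.
    log_mel = [[math.log(e + 1e-10) for e in frame] for frame in mel_spec]
    mfcc = [dct2(frame, n_mfcc) for frame in log_mel]
    return spec, mel_spec, mfcc
```

The sketch shows the chain of dependencies: the Mel-spectrogram is the spectrogram pooled through mel-spaced triangular filters, and the MFCCs are a discrete cosine transform of the log Mel energies. In practice a library with a fast Fourier transform would replace the naive DFT used here for self-containment.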
Pages: 299-312
Number of pages: 13