Exploring data augmentation for Amazigh speech recognition with convolutional neural networks

被引:0
|
作者
Hossam Boulal [1 ]
Farida Bouroumane [1 ]
Mohamed Hamidi [2 ]
Jamal Barkani [1 ]
Mustapha Abarkan [1 ]
机构
[1] FP Taza,LSI Laboratory
[2] USMBA University,Team of Modeling and Scientific Computing
[3] FPN,undefined
[4] UMP,undefined
关键词
Speech recognition; Data augmentation; Deep learning; Feature extraction; Amazigh digits;
D O I
10.1007/s10772-024-10164-y
中图分类号
学科分类号
摘要
In the field of speech recognition, enhancing accuracy is paramount for diverse linguistic communities. Our study addresses this necessity, focusing on improving Amazigh speech recognition through the implementation of three distinct data augmentation methods: Audio Augmentation, FilterBank Augmentation, and SpecAugment. Leveraging Convolutional Neural Networks (CNNs) for speech recognition, we utilize Mel Spectrograms extracted from audio files. The study specifically targets the recognition of the initial ten Amazigh digits. We conducted experiments with a speaker-independent approach involving 42 participants. A total of 27 experiments were conducted, utilizing both original and augmented data. Among the different CNN models employed, the VGG19 model showcased significant promise. Our results demonstrate a maximum accuracy of 95.66%. Furthermore, the most notable improvement achieved through data augmentation was 4.67%. These findings signify a substantial enhancement in speech recognition accuracy, indicating the efficacy of the proposed methods.
引用
收藏
页码:53 / 65
页数:12
相关论文
共 50 条
  • [41] Convolutional Maxout Neural Networks for Low-Resource Speech Recognition
    Cai, Meng
    Shi, Yongzhe
    Kang, Jian
    Liu, Jia
    Su, Tengrong
    2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014, : 133 - +
  • [42] Speaker Recognition Using Constrained Convolutional Neural Networks in Emotional Speech
    Simic, Nikola
    Suzic, Sinisa
    Nosek, Tijana
    Vujovic, Mia
    Peric, Zoran
    Savic, Milan
    Delic, Vlado
    ENTROPY, 2022, 24 (03)
  • [43] Factored deep convolutional neural networks for noise robust speech recognition
    Fujimoto, Masakiyo
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3837 - 3841
  • [44] Bangla Short Speech Commands Recognition Using Convolutional Neural Networks
    Sumon, Shakil Ahmed
    Chowdhury, Joydip
    Debnath, Sujit
    Mohammed, Nabeel
    Momen, Sifat
    2018 INTERNATIONAL CONFERENCE ON BANGLA SPEECH AND LANGUAGE PROCESSING (ICBSLP), 2018,
  • [45] Enhancing Speech Emotion Recognition Using Deep Convolutional Neural Networks
    Islam, M. M. Manjurul
    Kabir, Md Alamgir
    Sheikh, Alamin
    Saiduzzaman, Muhammad
    Hafid, Abdelakram
    Abdullah, Saad
    PROCEEDINGS OF THE 2024 9TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING TECHNOLOGIES, ICMLT 2024, 2024, : 95 - 100
  • [46] Recognition of Emotions in Speech Using Convolutional Neural Networks on Different Datasets
    Zielonka, Marta
    Piastowski, Artur
    Czyzewski, Andrzej
    Nadachowski, Pawel
    Operlejn, Maksymilian
    Kaczor, Kamil
    ELECTRONICS, 2022, 11 (22)
  • [47] Maxout neurons for deep convolutional and LSTM neural networks in speech recognition
    Cai, Meng
    Liu, Jia
    SPEECH COMMUNICATION, 2016, 77 : 53 - 64
  • [48] Improvement on Speech Emotion Recognition Based on Deep Convolutional Neural Networks
    Niu, Yafeng
    Zou, Dongsheng
    Niu, Yadong
    He, Zhongshi
    Tan, Hua
    PROCEEDINGS OF 2018 INTERNATIONAL CONFERENCE ON COMPUTING AND ARTIFICIAL INTELLIGENCE (ICCAI 2018), 2018, : 13 - 18
  • [49] Recognition of Emotional Speech with Convolutional Neural Networks by Means of Spectral Estimates
    Weisskirchen, Norman
    Boeck, Ronald
    Wendemuth, Andreas
    2017 SEVENTH INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION WORKSHOPS AND DEMOS (ACIIW), 2017, : 50 - 55
  • [50] Speech Emotion Recognition using Convolutional Recurrent Neural Networks and Spectrograms
    Qamhan, Mustafa A.
    Meftah, Ali H.
    Selouani, Sid-Ahmed
    Alotaibi, Yousef A.
    Zakariah, Mohammed
    Seddiq, Yasser Mohammad
    2020 IEEE CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (CCECE), 2020,