Exploring data augmentation for Amazigh speech recognition with convolutional neural networks

Authors
Hossam Boulal [1]
Farida Bouroumane [1]
Mohamed Hamidi [2]
Jamal Barkani [1]
Mustapha Abarkan [1]
Affiliations
[1] FP Taza, LSI Laboratory
[2] USMBA University, Team of Modeling and Scientific Computing
[3] FPN
[4] UMP
Keywords
Speech recognition; Data augmentation; Deep learning; Feature extraction; Amazigh digits
DOI
10.1007/s10772-024-10164-y
Abstract
In speech recognition, improving accuracy is important for diverse linguistic communities. This study aims to improve Amazigh speech recognition using three data augmentation methods: Audio Augmentation, FilterBank Augmentation, and SpecAugment. We use Convolutional Neural Networks (CNNs) for recognition, working from Mel spectrograms extracted from the audio files. The study targets recognition of the first ten Amazigh digits. Experiments follow a speaker-independent approach with 42 participants. In total, 27 experiments were conducted on both original and augmented data. Among the CNN models employed, VGG19 showed significant promise. The maximum accuracy reached was 95.66%, and the largest improvement obtained through data augmentation was 4.67%. These findings indicate that the proposed augmentation methods improve speech recognition accuracy.
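As an illustration of the spectrogram-level augmentation named in the abstract, the following is a minimal sketch of SpecAugment-style masking: one band of Mel channels and one band of time frames are zeroed out. It is not the authors' implementation. The function name `mask_spectrogram`, the mask widths, and the 64 x 100 toy spectrogram are all assumptions made for the example, and plain Python lists are used only to keep it dependency-free.

```python
import random


def mask_spectrogram(mel, max_freq_width=8, max_time_width=10, rng=None):
    """Return a copy of `mel` (list of n_mels rows, each n_frames long)
    with one random frequency band and one random time band set to 0.0.

    The maximum widths are hypothetical defaults, not values from the paper.
    """
    rng = rng or random.Random()
    n_mels, n_frames = len(mel), len(mel[0])
    out = [row[:] for row in mel]  # copy so the input is left untouched

    # Frequency mask: zero a contiguous block of Mel channels (rows).
    f_width = rng.randint(0, min(max_freq_width, n_mels))
    f_start = rng.randint(0, n_mels - f_width)
    for i in range(f_start, f_start + f_width):
        out[i] = [0.0] * n_frames

    # Time mask: zero a contiguous block of frames (columns).
    t_width = rng.randint(0, min(max_time_width, n_frames))
    t_start = rng.randint(0, n_frames - t_width)
    for row in out:
        for j in range(t_start, t_start + t_width):
            row[j] = 0.0

    return out


# Toy stand-in for a Mel spectrogram: 64 Mel bands x 100 frames, values >= 1.0.
rng = random.Random(0)
mel = [[1.0 + rng.random() for _ in range(100)] for _ in range(64)]
augmented = mask_spectrogram(mel, rng=rng)
```

Each call draws new mask positions, so applying the function repeatedly to the same utterance yields several distinct training examples with the same label.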
Pages: 53-65
Page count: 12