Incorporating Noise Robustness in Speech Command Recognition by Noise Augmentation of Training Data

Cited by: 32
Authors
Pervaiz, Ayesha [1 ]
Hussain, Fawad [1 ]
Israr, Huma [2 ]
Tahir, Muhammad Ali [2 ]
Raja, Fawad Riasat [3 ]
Baloch, Naveed Khan [1 ]
Ishmanov, Farruh [4 ]
Zikria, Yousaf Bin [5 ]
Affiliations
[1] Univ Engn & Technol, Dept Comp Engn, Taxila 47050, Pakistan
[2] Natl Univ Sci & Technol NUST, Sch Elect Engn & Comp Sci SEECS, H-12, Islamabad, Pakistan
[3] Griffith Univ, Machine Intelligence & Pattern Anal Lab, Nathan, Qld 4111, Australia
[4] Kwangwoon Univ, Dept Elect & Commun Engn, Seoul 4471, South Korea
[5] Yeungnam Univ, Dept Informat & Commun Engn, Gyongsan 38541, South Korea
Keywords
automatic speech recognition; voice recognition; acoustic modelling; language modelling; deep learning; deep neural networks; word error rate; data science; speech command set; kaldi; CONVOLUTIONAL NEURAL-NETWORKS;
DOI
10.3390/s20082326
Chinese Library Classification
O65 [Analytical Chemistry];
Discipline classification code
070302 ; 081704 ;
Abstract
The advent of new devices, technology, and machine learning techniques, together with the availability of free large speech corpora, has resulted in rapid and accurate speech recognition. In the last two decades, researchers and different organizations have initiated extensive research to experiment with new techniques and their applications in speech processing systems. There are several speech command based applications in the areas of robotics, IoT, ubiquitous computing, and different human-computer interfaces. Various researchers have worked on enhancing the efficiency of speech command based systems and used the speech command dataset. However, none of them catered to noise in that dataset. Noise is one of the major challenges in any speech recognition system, as real-time noise is a very versatile and unavoidable factor that affects the performance of speech recognition systems, particularly those that have not learned the noise efficiently. We thoroughly analyse the latest trends in speech recognition and evaluate the speech command dataset on different machine learning based and deep learning based techniques. A novel technique is proposed for noise robustness by augmenting noise in training data. Our proposed technique is tested on clean and noisy data along with locally generated data and achieves much better results than existing state-of-the-art techniques, thus setting a new benchmark.
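The abstract does not describe how the noise is added to the training data, so the following is only a generic sketch of one common form of noise augmentation: mixing a noise recording into a clean utterance at a chosen signal-to-noise ratio (SNR). The function name `mix_at_snr`, the list-of-samples representation, and the choice of SNR value are assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the mixture has the given SNR in dB.

    Hypothetical helper for illustration; both inputs are sequences of
    float samples at the same sampling rate.
    """
    if not speech:
        return []
    if not noise:
        raise ValueError("noise must be non-empty")

    # Repeat the noise so it covers the utterance, then trim to length.
    reps = -(-len(speech) // len(noise))  # ceiling division
    noise = (list(noise) * reps)[: len(speech)]

    # Average power of each signal.
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if speech_power == 0.0 or noise_power == 0.0:
        return list(speech)  # nothing meaningful to scale

    # SNR_dB = 10 * log10(P_speech / P_noise), solved for the noise gain.
    target_noise_power = speech_power / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / noise_power)

    return [s + gain * n for s, n in zip(speech, noise)]


# Toy usage: a sine wave stands in for speech, uniform noise for background.
rng = random.Random(0)
clean = [math.sin(0.1 * i) for i in range(1000)]
background = [rng.uniform(-1.0, 1.0) for _ in range(300)]
noisy = mix_at_snr(clean, background, snr_db=10.0)
```

In a training pipeline, a function like this would typically be applied to copies of the clean utterances over a range of SNR values and noise types, so that the model sees both clean and noisy versions of each command.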
Pages: 19
Related papers
50 in total
  • [1] In domain training data augmentation on noise robust Punjabi Children speech recognition
    Virender Kadyan
    Puneet Bawa
    Taniya Hasija
    [J]. Journal of Ambient Intelligence and Humanized Computing, 2022, 13 : 2705 - 2721
  • [2] In domain training data augmentation on noise robust Punjabi Children speech recognition
    Kadyan, Virender
    Bawa, Puneet
    Hasija, Taniya
    [J]. JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING, 2021, 13 (5) : 2705 - 2721
  • [3] Adding Noise to Improve Noise Robustness in Speech Recognition
    Morales, Nicolas
    Gu, Liang
    Gao, Yuqing
    [J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 861 - +
  • [4] Toward noise robustness speech recognition
    Namarvar, HH
    Liaw, J
    Berger, TW
    [J]. 2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING - VOL IV: SIGNAL PROCESSING FOR COMMUNICATIONS; VOL V: SIGNAL PROCESSING EDUCATION SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO & ELECTROACOUSTICS; VOL VI: SIGNAL PROCESSING THEORY & METHODS STUDENT FORUM, 2001, : 4016 - 4016
  • [5] Bring the Noise: Introducing Noise Robustness to Pretrained Automatic Speech Recognition
    Eickhoff, Patrick
    Moeller, Matthias
    Rosin, Theresa Pekarek
    Twiefel, Johannes
    Wermter, Stefan
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT VII, 2023, 14260 : 376 - 388
  • [6] Noise Robustness of Tract Variables and their Application to Speech Recognition
    Mitra, Vikramjit
    Nam, Hosung
    Espy-Wilson, Carol
    Saltzman, Elliot
    Goldstein, Louis
    [J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2735 - +
  • [7] Improving Noise Robustness of Speech Emotion Recognition System
    Juszkiewicz, Lukasz
    [J]. INTELLIGENT DISTRIBUTED COMPUTING VII, 2014, 511 : 223 - 232
  • [8] GENERATIVE ADVERSARIAL NETWORKS BASED DATA AUGMENTATION FOR NOISE ROBUST SPEECH RECOGNITION
    Hu, Hu
    Tan, Tian
    Qian, Yanmin
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5044 - 5048
  • [9] Noise and speaker robustness in a Persian continuous speech recognition system
    Veisi, Hadi
    Sameti, Hossein
    [J]. 2007 9TH INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND ITS APPLICATIONS, VOLS 1-3, 2007, : 73 - 76
  • [10] Analysis and compensation of speech under stress and noise for environmental robustness in speech recognition
    Hansen, JHL
    [J]. SPEECH COMMUNICATION, 1996, 20 (1-2) : 151 - 173