Enhancements in automatic Kannada speech recognition system by background noise elimination and alternate acoustic modelling

Cited by: 11
Authors
Thimmaraja Yadava, G. [1]
Jayanna, H. S. [2]
Affiliations
[1] Sch Engn & Technol, Dept Elect & Commun Engn, Kanakapura Rd, Tumkur, Karnataka, India
[2] Siddaganga Inst Technol, Dept Informat Sci & Engg, Tumkur, Karnataka, India
Keywords
Speech; Speech recognition; Interactive voice response system (IVRS); Automatic speech recognition (ASR); Spectral subtraction with voice activity detection (SS-VAD); Minimum mean square error spectrum power estimator based on zero crossing (MMSE-SPZC); Minimum mean square error spectrum power (MMSE-SP); Maximum a Posteriori (MAP);
DOI
10.1007/s10772-020-09671-5
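Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract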
This paper demonstrates in detail the improvements made to a recently implemented Kannada speech recognition system. The Kannada automatic speech recognition (ASR) system consists of ASR models created using Kaldi, an IVRS call flow, and weather and agricultural commodity price information databases. The task-specific speech data used in the recently developed spoken dialogue system contained high levels of various background noises. These noises adversely affected both online and offline speech recognition performance. Therefore, to improve the speech recognition accuracy of the Kannada ASR system, a noise reduction algorithm is developed that fuses spectral subtraction with voice activity detection (SS-VAD) and a minimum mean square error spectrum power estimator based on zero crossing (MMSE-SPZC). The noise elimination algorithm is added to the system before the feature extraction stage. Alternative ASR models are created using subspace Gaussian mixture model (SGMM) and deep neural network (DNN) modeling techniques. The experimental results show that combining the noise elimination technique with SGMM/DNN-based modeling gives a relative accuracy improvement of 7.68% over the recently developed GMM-HMM-based ASR system. The acoustic models with the lowest word error rate (WER) could be used in the spoken dialogue system. The developed spoken query system is tested with farmers in Karnataka in an uncontrolled environment.
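The abstract selects acoustic models by word error rate (WER). As a minimal sketch of how that metric is conventionally computed (word-level edit distance divided by the number of reference words), the following may help; the function name `word_error_rate` and the placeholder word sequences are assumptions for illustration and are not taken from the paper or from Kaldi's scoring tools.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Hypothetical helper for illustration; not the authors' scoring script.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein recurrence).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Placeholder word sequences: one substitution and one deletion over 4 words.
example_wer = word_error_rate("w1 w2 w3 w4", "w1 wX w3")
```

Under this definition, comparing models amounts to computing the WER of each model's transcripts against the same references and keeping the model with the smallest value.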
Pages: 149-167
Number of pages: 19