Continuous Tamil Speech Recognition technique under non stationary noisy environments

Cited by: 4
Authors
Kalamani, M. [1]
Krishnamoorthi, M. [2]
Valarmathi, R. S. [1]
Affiliations
[1] Bannari Amman Inst Technol, Dept Elect & Commun Engn, Sathyamangalam 638401, Tamil Nadu, India
[2] Bannari Amman Inst Technol, Dept Comp Sci & Engn, Sathyamangalam 638401, Tamil Nadu, India
Keywords
Noise estimation; Speech enhancement; Speech segmentation; MFCC; FCM; EM-GMM; CTSR; Noisy environments; FILTERED-X LMS; FEEDBACK CANCELLATION; ALGORITHM; MODELS
DOI
10.1007/s10772-018-09580-8
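Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract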
In recent years, demand for continuous speech recognition systems for the Tamil language has grown considerably. This work proposes an efficient Continuous Tamil Speech Recognition (CTSR) technique for non-stationary noisy environments. The work consists of two stages: speech enhancement and modelling. In the enhancement stage, a modified Modulation Magnitude Estimation based Spectral Subtraction with Chi-Square Distribution based Noise Estimation (SS-NE) algorithm is proposed to enhance noisy Tamil speech under various non-stationary noise conditions. To extract speech segments from continuous speech, the enhanced signal is then segmented using a combination of its short-time signal energy and spectral centroid features. Twenty-six mel frequency cepstral coefficients (MFCCs) per frame are found to be optimal and are used as the acoustic feature vector for each frame. Fuzzy C-Means (FCM) clustering is used to cluster the extracted feature vectors into discrete symbols; the evaluation results show that the optimal number of clusters 'C' is 5. Finally, Tamil speech from various speakers is recognized using an Expectation Maximization Gaussian Mixture Model (EM-GMM) with 16 component densities, using continuous measurements of the labelled features from FCM clustering, in order to reduce the word error rate (WER). The simulation results show that, compared with existing algorithms under different noisy environments, the proposed FCM with EM-GMM model for CTSR improves recognition accuracy by 1.2 to 4.4% and reduces the WER by 1.6 to 5.47%.
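The clustering step named in the abstract can be illustrated with a minimal sketch of standard Fuzzy C-Means applied to 26-dimensional feature vectors with 5 clusters. This is not the authors' implementation: the function name `fuzzy_c_means`, the fuzzifier `m = 2.0`, the iteration limit, the tolerance, and the random stand-in features are all assumptions made for illustration. Only the values 26 and 5 come from the abstract.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters=5, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard Fuzzy C-Means. X has shape (n_frames, n_features).

    Returns (centers, U), where U[i, k] is the membership of frame i
    in cluster k and each row of U sums to 1.
    m, n_iter, tol and seed are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Start from a random membership matrix with normalized rows.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Cluster centers: membership-weighted means of the frames.
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Euclidean distance from every frame to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # avoid division by zero
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Random stand-in for per-frame MFCC vectors (NOT real speech data).
features = np.random.default_rng(1).normal(size=(200, 26))
centers, U = fuzzy_c_means(features, n_clusters=5)
# Hard symbol per frame: the cluster with the highest membership.
labels = U.argmax(axis=1)
```

Taking the highest-membership cluster per frame is one simple way to obtain discrete symbols; the abstract does not say how the paper maps memberships to labels.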
Pages: 47-58
Page count: 12
Published in: International Journal of Speech Technology, 2019, 22: 47-58