ROBUST FRONT-END PROCESSING FOR SPEECH RECOGNITION IN NOISY CONDITIONS

Authors
Das, Biswajit [1]
Panda, Ashish [1]
Affiliations
[1] TCS Innovat Labs, Yantra Pk, Thana 400601, Maharashtra, India
Keywords
Noise robust speech recognition; Auditory masking; Vector Taylor series; Root compression; Frame suitability measure; Enhancement; Features
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, we investigate the applicability and effectiveness of advanced feature compensation techniques in devising a robust front-end for Automatic Speech Recognition (ASR). First, the Vector Taylor Series (VTS) equations are modified by introducing an auditory masking factor. The resulting VTS approximation is used to compensate the parameters of a clean speech model, and a Minimum Mean Square Error (MMSE) estimator recovers the clean speech features from the noisy features. Second, we apply root compression instead of the conventional log compression to the mel filter-bank energies. Third, we apply a frame selection method that eliminates noise-dominated frames to improve performance in high-noise scenarios. The proposed algorithms are validated on noise-corrupted LibriSpeech and TIMIT speech recognition databases and are shown to provide a significant gain in performance.
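The second step in the abstract, replacing log compression with root compression, can be illustrated with a minimal sketch. The exponent `gamma`, the `floor` value, and the sample energies are assumptions chosen for illustration; the abstract does not state the values the authors used, and this is not their implementation.

```python
import numpy as np


def log_compress(energies, floor=1e-10):
    """Conventional log compression; `floor` (assumed) avoids log(0)."""
    return np.log(np.maximum(energies, floor))


def root_compress(energies, gamma=0.1):
    """Root compression: raise each energy to a small power `gamma` (assumed value)."""
    return np.power(np.maximum(energies, 0.0), gamma)


# Hypothetical mel filter-bank energies for one frame, spanning many orders of magnitude.
energies = np.array([1e-8, 1e-4, 1.0, 1e4])

log_feats = log_compress(energies)
root_feats = root_compress(energies)
```

Both functions are monotonic, so the ordering of the energies is preserved. The difference is at the low end: as an energy approaches zero, the log output grows without bound in the negative direction, while the root output stays bounded and approaches zero. This reduced sensitivity to low-energy regions is one commonly cited motivation for root compression in noisy conditions.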
Pages: 5235-5239
Number of pages: 5