Robust automatic speech recognition using PD-MEEMLIN

Authors
Hernandez, Igmar [1]
Garcia, Paola [1]
Nolazco, Juan [1]
Buera, Luis [2]
Lleida, Eduardo [2]
Affiliations
[1] Tecnol Monterrey, Dept Comp Sci, Campus Monterrey, Monterrey, Mexico
[2] Univ Zaragoza, Commun Technol Grp GTC, I3A, Zaragoza, Spain
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This work presents a robust normalization technique that cascades a speech enhancement method with a feature vector normalization algorithm. Speech enhancement is provided by the Spectral Subtraction (SS) algorithm, which reduces the effect of additive noise by subtracting an estimate of the noise spectrum from the complete speech spectrum. Separately, an empirical feature vector normalization technique known as PD-MEMLIN (Phoneme-Dependent Multi-Environment Models based LInear Normalization) has also been shown to be effective. PD-MEMLIN models the clean and noisy spaces with Gaussian Mixture Models (GMMs) and estimates a set of linear compensation transformations that are used to clean the signal. The proper integration of both approaches is studied, and the final design, PD-MEEMLIN (Phoneme-Dependent Multi-Environment Enhanced Models based LInear Normalization), confirms and improves the effectiveness of both approaches. The results show that on highly degraded speech PD-MEEMLIN outperforms SS by between 11.4% and 34.5%, and PD-MEMLIN by between 11.7% and 24.84%. Furthermore, at moderate SNR, i.e. 15 or 20 dB, PD-MEEMLIN is as good as the PD-MEMLIN and SS techniques.
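The spectral subtraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify which SS variant, frame size, or noise estimator was used, so the magnitude-domain subtraction, the `floor` parameter, and all function names below are assumptions chosen for illustration. A naive DFT is used only to keep the example free of external dependencies.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of a real-valued frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def spectral_subtraction(noisy_frame, noise_magnitude, floor=0.01):
    """Subtract a noise magnitude estimate from each frequency bin.

    noise_magnitude holds one (assumed, pre-computed) noise estimate per bin.
    The floor keeps a small fraction of the original magnitude so that no bin
    becomes negative; the noisy phase is reused unchanged.
    """
    cleaned = []
    for value, noise in zip(dft(noisy_frame), noise_magnitude):
        magnitude = abs(value)
        reduced = max(magnitude - noise, floor * magnitude)
        cleaned.append(cmath.rect(reduced, cmath.phase(value)))
    return idft(cleaned)
```

In practice the noise estimate would typically be averaged over frames known to contain no speech, and the enhanced frames would then be passed to feature extraction before any feature-domain normalization such as PD-MEMLIN.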
Pages: 1 / +
Page count: 2