Multiple resolution analysis for robust automatic speech recognition

Cited by: 7
Authors
Gemello, R
Mana, F
Albesano, D
De Mori, R
Affiliations
[1] Loquendo, I-10149 Turin, Italy
[2] Univ Avignon, F-84911 Avignon, France
Source
COMPUTER SPEECH AND LANGUAGE | 2006 / Vol. 20 / Issue 01
DOI
10.1016/j.csl.2004.06.001
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper investigates the potential of exploiting the redundancy implicit in multiple resolution analysis for automatic speech recognition systems. The analysis is performed by a binary tree of elements, each of which consists of a half-band filter followed by a downsampler that discards odd samples. Filter design and feature computation from samples are discussed, and recognition performance with different choices is presented. A paradigm consisting of redundant feature extraction, followed by feature normalization, followed by dimensionality reduction, is proposed. Feature normalization is performed by denoising algorithms. Two such algorithms are considered and evaluated: signal-to-noise ratio-dependent spectral subtraction and soft thresholding. Dimensionality reduction is performed with principal component analysis. Experiments using telephone corpora and the Aurora3 corpus are reported. They indicate that, with clean speech, the proposed paradigm yields a recognition performance, measured in word error rate, marginally superior to that obtained with perceptual linear prediction coefficients. With noisy data, however, the proposed analysis paradigm performs significantly better when the same denoising algorithm is applied to all the analysis methods being compared. © 2004 Elsevier Ltd. All rights reserved.
Pages: 2-21
Number of pages: 20
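
The processing chain named in the abstract (a binary tree of half-band filters with downsampling, denoising by soft thresholding, then principal component analysis) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Haar filter pair, the log-energy features, the point at which thresholding is applied, the tree depth, and all function names.

```python
import numpy as np

# Assumption: a Haar half-band filter pair. The paper discusses its own
# filter designs, which are not given in the abstract.
LOW = np.array([1.0, 1.0]) / np.sqrt(2.0)
HIGH = np.array([1.0, -1.0]) / np.sqrt(2.0)


def split(x):
    """One tree element per branch: half-band filter, then keep the
    even-indexed samples (i.e. discard the odd-indexed ones)."""
    low = np.convolve(x, LOW)[::2]
    high = np.convolve(x, HIGH)[::2]
    return low, high


def soft_threshold(c, t):
    """Shrink every value toward zero by t; values within [-t, t] become 0."""
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)


def tree_energies(frame, depth=3, t=0.0):
    """Log-energy of every node at every level of a full binary tree.

    Keeping all levels (not just the leaves) makes the feature set
    redundant. Thresholding node samples before the energy is computed is
    an illustrative choice, not the paper's stated procedure.
    """
    nodes, feats = [np.asarray(frame, dtype=float)], []
    for _ in range(depth):
        nodes = [child for n in nodes for child in split(n)]
        for n in nodes:
            feats.append(np.log(np.sum(soft_threshold(n, t) ** 2) + 1e-10))
    return np.array(feats)


def pca_reduce(features, k):
    """Project row-wise feature vectors onto their first k principal axes."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T


# Usage: 50 synthetic frames of 256 samples stand in for speech frames.
rng = np.random.default_rng(0)
frames = rng.standard_normal((50, 256))
redundant = np.stack([tree_energies(f, depth=3, t=0.1) for f in frames])
reduced = pca_reduce(redundant, k=4)
```

With a depth of 3 the tree yields 2 + 4 + 8 = 14 node energies per frame, which the PCA step then reduces to 4 dimensions in this example.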