Multiple resolution analysis for robust automatic speech recognition

Cited by: 7
Authors
Gemello, R
Mana, F
Albesano, D
De Mori, R
Affiliations
[1] Loquendo, I-10149 Turin, Italy
[2] Univ Avignon, F-84911 Avignon, France
Source
COMPUTER SPEECH AND LANGUAGE | 2006 / Vol. 20 / Issue 01
DOI
10.1016/j.csl.2004.06.001
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper investigates the potential of exploiting the redundancy implicit in multiple resolution analysis for automatic speech recognition systems. The analysis is performed by a binary tree of elements, each consisting of a half-band filter followed by a downsampler that discards odd samples. Filter design and the computation of features from samples are discussed, and recognition performance with different choices is presented. A paradigm is proposed that consists of redundant feature extraction, followed by feature normalization, followed by dimensionality reduction. Feature normalization is performed by denoising algorithms, two of which are considered and evaluated: signal-to-noise ratio-dependent spectral subtraction and soft thresholding. Dimensionality reduction is performed with principal component analysis. Experiments using telephone corpora and the Aurora3 corpus are reported. They indicate that, on clean speech, the proposed paradigm yields recognition performance, measured in word error rate, marginally superior to that obtained with perceptual linear prediction coefficients. On noisy data, however, the proposed analysis paradigm performs significantly better when the same denoising algorithm is applied to all the analysis methods being compared. (c) 2004 Elsevier Ltd. All rights reserved.
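The three-stage paradigm described in the abstract (redundant feature extraction, normalization by denoising, dimensionality reduction) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Haar filter pair is a crude stand-in for the half-band filters whose design the paper discusses, log-energy per tree node is an assumed feature definition, and the names `split`, `tree_nodes`, `frame_features`, and `pca_reduce` are hypothetical. Only soft thresholding is shown; the SNR-dependent spectral subtraction variant is omitted.

```python
import numpy as np

# Assumption: a Haar pair stands in for the paper's half-band filters.
LOW = np.array([1.0, 1.0]) / np.sqrt(2.0)
HIGH = np.array([1.0, -1.0]) / np.sqrt(2.0)


def split(x):
    """One tree element per branch: filter, then discard odd-indexed samples."""
    low = np.convolve(x, LOW)[::2]
    high = np.convolve(x, HIGH)[::2]
    return low, high


def tree_nodes(x, depth):
    """Signals at every node of a full binary tree (root excluded).

    Keeping all levels, not just the leaves, is what makes the
    representation redundant.
    """
    nodes, frontier = [], [np.asarray(x, dtype=float)]
    for _ in range(depth):
        children = []
        for signal in frontier:
            children.extend(split(signal))
        nodes.extend(children)
        frontier = children
    return nodes


def soft_threshold(x, t):
    """Shrink each sample toward zero by t; samples with |x| <= t become 0."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def frame_features(frame, depth=3, t=0.0):
    """Assumed feature: log energy of each denoised node signal."""
    nodes = tree_nodes(frame, depth)
    energies = [np.sum(soft_threshold(n, t) ** 2) for n in nodes]
    return np.log(np.array(energies) + 1e-10)


def pca_reduce(features, k):
    """Project mean-centered feature vectors onto the top-k principal axes."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.standard_normal((20, 256))  # 20 synthetic frames
    feats = np.stack([frame_features(f, depth=3, t=0.1) for f in frames])
    reduced = pca_reduce(feats, k=6)
    print(feats.shape, reduced.shape)
```

A depth-3 tree yields 2 + 4 + 8 = 14 node signals per frame, so each frame maps to a 14-dimensional redundant feature vector before PCA reduces it to `k` dimensions. The threshold `t` and the depth are illustrative values, not settings reported in the paper.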
Pages: 2-21
Number of pages: 20