The regularized SNN-TA model for recognition of noisy speech

Cited by: 2
Authors
Trentin, E [1]
Matassoni, M [1]
Affiliations
[1] ITC Irst, Trent, Italy
DOI
10.1109/IJCNN.2000.861441
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Segmental Neural Network (SNN) architecture was introduced at BBN by Zavaliagkos et al. for rescoring the N-best hypotheses yielded by a standard continuous density hidden Markov model (CDHMM) applied to automatic speech recognition. An enhanced connectionist model, called SNN with trainable amplitude of activation functions (SNN-TA), is first used in this paper in place of the CDHMM to recognize isolated words. A Viterbi-based segmentation, relying on the level building algorithm, is then introduced; it can be combined with the SNN-TA to obtain a hybrid framework for continuous speech recognition. The present paradigm is applied to the recognition of isolated digits, collected in a real car environment under several noisy conditions (traffic, speed, road conditions, etc.) using a microphone placed far from the talker. We stress that robustness to noise can be increased by improving the generalization capabilities of the speech recognizer. From this perspective, while CDHMMs completely lack a proper regularization theory, a regularized SNN-TA model is discussed, which yields effective generalization and noise tolerance, outperforming the CDHMM on the noisy task under consideration.
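The abstract does not give the form of the trainable-amplitude activation or of the regularizer. As a rough illustration only, the sketch below assumes a single sigmoid unit whose output is scaled by a learnable amplitude `lam`, trained by gradient descent on a squared error with an L2 weight-decay penalty. The names `forward`, `train_step`, `lam`, `lr`, and `decay` are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(a):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-a))


def forward(x, w, b, lam):
    """Output of one unit with amplitude lam: lam * sigmoid(w . x + b)."""
    a = sum(wi * xi for wi, xi in zip(w, x)) + b
    return lam * sigmoid(a)


def train_step(x, target, w, b, lam, lr=0.1, decay=1e-3):
    """One gradient-descent step on
    0.5 * (y - target)**2 + 0.5 * decay * ||w||**2   (assumed loss).
    The amplitude lam is updated together with the weights and bias."""
    a = sum(wi * xi for wi, xi in zip(w, x)) + b
    s = sigmoid(a)
    err = lam * s - target
    grad_lam = err * s                    # d loss / d lam
    grad_a = err * lam * s * (1.0 - s)    # d loss / d pre-activation
    new_w = [wi - lr * (grad_a * xi + decay * wi) for wi, xi in zip(w, x)]
    new_b = b - lr * grad_a
    new_lam = lam - lr * grad_lam
    return new_w, new_b, new_lam


# Usage: fit a target outside the (0, 1) range of a plain sigmoid.
w, b, lam = [0.1, 0.1], 0.0, 1.0
for _ in range(500):
    w, b, lam = train_step([1.0, -0.5], 2.0, w, b, lam)
```

In this toy setting the amplitude has to grow above 1 for the unit to reach the target, while the decay term keeps the weights small; this is one simple way to combine an adaptive output range with a regularization penalty, not necessarily the scheme used by the authors.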
Pages: 97-102
Page count: 6