Joint decoding of multiple speech patterns for robust speech recognition

Cited by: 0
Authors
Nair, Nishanth Ulhas [1 ]
Sreenivas, T. V. [1 ]
Affiliations
[1] Indian Inst Sci, Dept Elect Commun Engn, Bangalore 560012, Karnataka, India
Source
2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2 | 2007
Keywords
Robust speech recognition; Viterbi Algorithm; Dynamic Time Warping; burst noise
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We address a new problem: improving automatic speech recognition performance given multiple utterances of patterns from the same class. We formulate the problem of jointly decoding K multiple patterns given a single Hidden Markov Model. It is shown that such a solution is possible by aligning the K patterns using the proposed Multi Pattern Dynamic Time Warping algorithm, followed by the Constrained Multi Pattern Viterbi Algorithm. The new formulation is tested in the context of speaker-independent isolated word recognition for both clean and noisy patterns. When 10 percent of the speech is affected by burst noise at -5 dB Signal-to-Noise Ratio (local), it is shown that joint decoding using only two noisy patterns reduces the noisy speech recognition error rate to about 51 percent of that of single-pattern decoding using the Viterbi Algorithm. In contrast, a simple maximization of individual pattern likelihoods provides only about a 7 percent reduction in error rate.
Pages: 93-98
Page count: 6
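The abstract's first stage aligns the K patterns before joint Viterbi decoding. The proposed Multi Pattern DTW and Constrained Multi Pattern Viterbi Algorithm are the paper's contributions and are not specified in the abstract; as background, the sketch below shows only the classic two-pattern Dynamic Time Warping that MPDTW generalizes to K patterns. The function name and the Euclidean local distance are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def dtw_align(x, y):
    """Classic two-sequence Dynamic Time Warping (illustrative baseline).

    Aligns feature sequences x and y (frames along axis 0) and returns the
    optimal alignment cost plus the warping path as (i, j) frame pairs.
    """
    # Treat scalar sequences as 1-dimensional feature vectors per frame.
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    n, m = len(x), len(y)

    # Accumulated-cost matrix with an extra inf border for boundary handling.
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(x[i - 1] - y[j - 1])  # Euclidean local distance
            # Symmetric step pattern: diagonal match, insertion, deletion.
            D[i, j] = cost + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])

    # Backtrack from (n, m) to recover the warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return D[n, m], path[::-1]
```

With the patterns warped onto a common time axis, a joint likelihood over all K aligned frame tuples can then be maximized against the single HMM, which is the role the Constrained Multi Pattern Viterbi Algorithm plays in the paper.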