On including temporal constraints in Viterbi alignment for speech recognition in noise

Cited by: 14
Authors
Yoma, NB [1]
McInnes, FR [1]
Jack, MA [1]
Stump, SD [1]
Ling, LL [1]
Affiliations
[1] Univ Chile, Dept Elect Engn, Santiago, Chile
Source
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 2001 / Vol. 9 / No. 02
Keywords
duration modeling; noise robustness; speech recognition
DOI
10.1109/89.902285
Chinese Library Classification number
O42 [Acoustics]
Subject classification code
070206; 082403
Abstract
This paper addresses the problem of temporal constraints in the Viterbi algorithm in speaker-dependent and speaker-independent tasks. The results presented here suggest that, in a speaker-dependent task, (1) introducing temporal constraints can lead to a large improvement under additive or convolutional noise; (2) statistical modeling of state durations is not relevant if maximum and minimum state duration restrictions are imposed; and (3) truncated probability densities give better results than a previously proposed metric. Finally, word-position-dependent and word-position-independent temporal restrictions are compared in connected-word speech recognition experiments, and the former are shown to give better results at the same computational load. However, the effect of the duration model could be much less significant when the acoustic model is optimized and the training and testing conditions are matched.
Pages: 179 - 182
Number of pages: 4
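
The abstract refers to imposing minimum and maximum state duration restrictions during Viterbi alignment. As a rough illustration only, the sketch below shows one common way such hard bounds can be enforced: each state of a left-to-right chain is expanded with a counter of frames already spent in it. This is not the authors' implementation. The function name constrained_viterbi, the arguments log_obs, d_min and d_max, and the omission of transition probabilities and of any duration density are all assumptions made for the example.

```python
import math

def constrained_viterbi(log_obs, d_min, d_max):
    """Align T frames to a left-to-right state chain under duration bounds.

    log_obs[t][s]: log-likelihood of frame t in state s (assumed given).
    d_min[s], d_max[s]: hypothetical bounds (in frames, d_min >= 1).
    Returns (best_log_score, state_path), or (-inf, None) if infeasible.
    Transition probabilities are ignored; only the hard bounds are applied.
    """
    T, S = len(log_obs), len(d_min)
    NEG = -math.inf
    D = max(d_max)
    # score[s][d]: best score of partial paths currently in state s for d frames
    score = [[NEG] * (D + 1) for _ in range(S)]
    score[0][1] = log_obs[0][0]
    # back[t][s][1]: frames spent in state s-1 before entering s at time t
    back = [[[0] * (D + 1) for _ in range(S)]]
    for t in range(1, T):
        new = [[NEG] * (D + 1) for _ in range(S)]
        bp = [[0] * (D + 1) for _ in range(S)]
        for s in range(S):
            # Stay in state s: allowed only while the duration is <= d_max[s].
            for d in range(2, d_max[s] + 1):
                if score[s][d - 1] > NEG:
                    new[s][d] = score[s][d - 1] + log_obs[t][s]
            # Enter state s from s-1: allowed only once s-1 has lasted
            # at least d_min[s-1] frames.
            if s > 0:
                best, best_d = NEG, 0
                for d in range(d_min[s - 1], d_max[s - 1] + 1):
                    if score[s - 1][d] > best:
                        best, best_d = score[s - 1][d], d
                if best > NEG:
                    new[s][1] = best + log_obs[t][s]
                    bp[s][1] = best_d
        score = new
        back.append(bp)
    # The path must end in the last state with a duration inside its bounds.
    best, best_d = NEG, 0
    for d in range(d_min[S - 1], d_max[S - 1] + 1):
        if score[S - 1][d] > best:
            best, best_d = score[S - 1][d], d
    if best == NEG:
        return NEG, None
    # Backtrace: count the duration down, then jump to the previous state.
    path, s, d = [], S - 1, best_d
    for t in range(T - 1, -1, -1):
        path.append(s)
        if d > 1:
            d -= 1
        else:
            d = back[t][s][1]
            s -= 1
    return best, path[::-1]
```

With two states and six frames whose likelihoods favour leaving state 0 after one frame, setting d_min = [3, 1] forces the alignment to hold state 0 for three frames; this is the kind of effect that can keep noisy frames from collapsing or over-stretching a state. Expanding states by duration multiplies the search space by roughly the largest d_max, which is the usual cost of this simple formulation.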