On including temporal constraints in Viterbi alignment for speech recognition in noise

Cited by: 14

Authors:

Yoma, NB [1]

McInnes, FR [1]

Jack, MA [1]

Stump, SD [1]

Ling, LL [1]

Affiliation:

[1] Univ Chile, Dept Elect Engn, Santiago, Chile

Source:

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 2001, Vol. 9, No. 2

Keywords:

duration modeling; noise robustness; speech recognition

DOI:

10.1109/89.902285

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

This paper addresses the problem of temporal constraints in the Viterbi algorithm in speaker-dependent and speaker-independent tasks. The results presented here suggest that, in a speaker-dependent task, (1) introducing temporal constraints can lead to a large improvement under additive or convolutional noise, (2) statistical modeling of state durations is not relevant if maximum and minimum state duration restrictions are imposed, and (3) truncated probability densities give better results than a previously proposed metric. Finally, word-position-dependent and word-position-independent temporal restrictions are compared in connected-word speech recognition experiments, and the former is shown to give better results at the same computational load. However, the effect of the duration model could be much less significant when the acoustic model is optimized and the training and testing conditions are matched.
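As an illustration of the hard minimum and maximum state duration restrictions mentioned in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes a strictly left-to-right model with no skips, ignores transition probabilities, and takes precomputed per-frame log-likelihoods. The function name `constrained_viterbi` and the parameters `log_emis`, `min_dur`, and `max_dur` are hypothetical.

```python
NEG_INF = float("-inf")


def constrained_viterbi(log_emis, min_dur, max_dur):
    """Best left-to-right alignment under hard per-state duration bounds.

    log_emis[t][s] is the log-likelihood of frame t under state s.
    min_dur[s] >= 1 and max_dur[s] >= min_dur[s] bound the frames spent in s.
    Returns (score, path), or (-inf, []) if no alignment meets the bounds.
    """
    num_frames, num_states = len(log_emis), len(min_dur)
    longest = max(max_dur)
    # score[s][d]: best score of a partial path that ends in state s
    # after spending exactly d consecutive frames there.
    score = [[NEG_INF] * (longest + 1) for _ in range(num_states)]
    score[0][1] = log_emis[0][0]
    back = []  # one table of (prev_state, prev_duration) per frame t >= 1

    for t in range(1, num_frames):
        new = [[NEG_INF] * (longest + 1) for _ in range(num_states)]
        ptr = [[None] * (longest + 1) for _ in range(num_states)]
        for s in range(num_states):
            # Stay in s: allowed only while the duration is below max_dur[s].
            for d in range(2, max_dur[s] + 1):
                if score[s][d - 1] > NEG_INF:
                    new[s][d] = score[s][d - 1] + log_emis[t][s]
                    ptr[s][d] = (s, d - 1)
            # Enter s from s-1: allowed only once s-1 has lasted long enough.
            if s > 0:
                durations = range(min_dur[s - 1], max_dur[s - 1] + 1)
                best_d = max(durations, key=lambda d: score[s - 1][d])
                if score[s - 1][best_d] > NEG_INF:
                    new[s][1] = score[s - 1][best_d] + log_emis[t][s]
                    ptr[s][1] = (s - 1, best_d)
        score = new
        back.append(ptr)

    # The path must end in the last state with an admissible duration.
    last = num_states - 1
    end_d = max(range(min_dur[last], max_dur[last] + 1),
                key=lambda d: score[last][d])
    if score[last][end_d] == NEG_INF:
        return NEG_INF, []

    path, s, d = [last], last, end_d
    for ptr in reversed(back):
        s, d = ptr[s][d]
        path.append(s)
    path.reverse()
    return score[last][end_d], path


# Toy scores: frame 0 fits state 0, frames 1..5 fit state 1.
toy = [[-1.0, -5.0]] + [[-5.0, -1.0]] * 5
loose = constrained_viterbi(toy, [1, 1], [6, 6])
tight = constrained_viterbi(toy, [3, 1], [4, 4])
```

With the loose bounds the decoder leaves state 0 after a single frame. Raising the minimum duration of state 0 to three frames forces it to stay longer, which lowers the score but rules out implausibly short segments of the kind a noisy frame can trigger.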


Pages: 179-182

Page count: 4

