Template-based continuous speech recognition

Cited by: 93

Authors:

De Wachter, Mathias [1]

Matton, Mike

Demuynck, Kris

Wambacq, Patrick

Cools, Ronald

Van Compernolle, Dirk

Affiliations:

[1] Katholieke Univ Leuven, Elect Engn Dept ESAT, Speech Proc Res Grp, B-3000 Louvain, Belgium

[2] Katholieke Univ Leuven, Dept Comp Sci, NINES Grp, B-3000 Louvain, Belgium

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2007 / Vol. 15 / No. 04

Keywords:

dynamic time warping (DTW); episodic modeling; example-based recognition;

DOI:

10.1109/TASL.2007.894524

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Despite their known weaknesses, hidden Markov models (HMMs) have been the dominant technique for acoustic modeling in speech recognition for over two decades. Still, the advances in the HMM framework have not solved its key problems: it discards information about time dependencies and is prone to overgeneralization. In this paper, we attempt to overcome these problems by relying on straightforward template matching. The basis for the recognizer is the well-known DTW algorithm. However, classical DTW continuous speech recognition results in an explosion of the search space. The traditional top-down search is therefore complemented with a data-driven selection of candidates for DTW alignment. We also extend the DTW framework with a flexible subword unit mechanism and a class-sensitive distance measure, two components suggested by state-of-the-art HMM systems. The added flexibility of the unit selection in the template-based framework leads to new approaches to speaker and environment adaptation. The template matching system reaches a performance somewhat worse than the best published HMM results for the Resource Management benchmark, but thanks to the complementarity of errors between the HMM and DTW systems, the combination of both leads to a decrease in word error rate of 17% compared to the HMM results.
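For readers unfamiliar with the DTW algorithm the abstract names as the basis of the recognizer, the following is a minimal sketch of classical DTW between one stored template and one input sequence. It is not the authors' system: the function name `dtw_distance`, the Euclidean local distance, and the three-way step pattern are assumptions made for illustration, and the paper's data-driven candidate selection, subword unit mechanism, and class-sensitive distance measure are not shown.

```python
import math


def dtw_distance(template, utterance):
    """Accumulated cost of the best classical DTW alignment.

    Both arguments are sequences of equal-dimension feature vectors.
    Hypothetical helper: Euclidean local distance is an assumption,
    not the class-sensitive measure described in the abstract.
    """
    n, m = len(template), len(utterance)
    inf = math.inf
    # cost[i][j]: best accumulated cost aligning template[:i] with utterance[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(template[i - 1], utterance[j - 1])
            # Allowed steps: advance in the template, the utterance, or both.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# A time-stretched copy of the template aligns at zero cost.
template = [(0.0,), (1.0,)]
stretched = [(0.0,), (0.0,), (1.0,)]
print(dtw_distance(template, stretched))
```

The table has one cell per pair of frames, so the cost grows with the product of the two sequence lengths. Repeating this for every stored template at every possible start point gives a sense of the search-space explosion the abstract mentions for continuous speech.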

Citation

Pages: 1377 - 1390

Page count: 14

50 items in total

[1] Template-based Automatic Speech Recognition meets Prosody
Seppi, Dino
Demuynck, Kris
Van Compernolle, Dirk
[J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 552 - 555
[2] Data Pruning for Template-based Automatic Speech Recognition
Seppi, Dino
Van Compernolle, Dirk
[J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 901 - 904
[3] CMOS PROCESSOR FOR TEMPLATE-BASED SPEECH-RECOGNITION SYSTEM
DREWS, W
LAROIA, R
PANDEL, J
SCHUMACHER, A
STOLZLE, A
[J]. IEE PROCEEDINGS-I COMMUNICATIONS SPEECH AND VISION, 1989, 136 (02): : 155 - 161
[4] Using pitch as prior knowledge in template-based speech recognition
Aradilla, Guillermo
Vepa, Jithendra
Bourlard, Herve
[J]. 2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 445 - 448
[5] Template-based Spectral Estimation Using Microphone Array for Speech Recognition
Tamura, Satoshi
Hishikawa, Eriko
Taguchi, Wataru
Hayamizu, Satoru
[J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2050 - +
[6] Face recognition is not template-based
Carbon, CC
Leder, H
[J]. PERCEPTION, 2004, 33 : 103 - 103
[7] Template-based automatic recognition of birdsong syllables from continuous recordings
Anderson, SE
Dave, AS
Margoliash, D
[J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1996, 100 (02): : 1209 - 1219
[8] Template-based online character recognition
Connell, SD
Jain, AK
[J]. PATTERN RECOGNITION, 2001, 34 (01) : 1 - 14
[9] Probabilistic Template-Based Chord Recognition
Oudre, Laurent
Fevotte, Cedric
Grenier, Yves
[J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (08): : 2249 - 2259
[10] Isolated Tamil Digit Speech Recognition Using Template-Based and HMM-Based Approaches
Karpagavalli, S.
Deepika, R.
Kokila, P.
Rani, K. Usha
Chandra, E.
[J]. GLOBAL TRENDS IN INFORMATION SYSTEMS AND SOFTWARE APPLICATIONS, PT 2, 2012, 270 : 441 - +
