CONNECTIONIST APPROACHES TO LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION

Cited by: 0

Authors:

SAWAI, H

MINAMI, Y

MIYATAKE, M

WAIBEL, A

SHIKANO, K

Affiliations:

Source:

IEICE TRANSACTIONS ON COMMUNICATIONS ELECTRONICS INFORMATION AND SYSTEMS | 1991, Vol. 74, No. 7

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

This paper describes recent progress in a connectionist large-vocabulary continuous speech recognition system that integrates speech recognition and language processing. The speech recognition part consists of large phonemic Time-Delay Neural Networks (TDNNs), which can automatically spot all 24 Japanese phonemes (18 consonants /b/, /d/, /g/, /p/, /t/, /k/, /m/, /n/, /N/, /s/, /sh/ ([ʃ]), /h/, /z/, /ch/ ([tʃ]), /ts/, /r/, /w/, /y/ ([j]); 5 vowels /a/, /i/, /u/, /e/, /o/; and a double consonant /Q/ or silence) simply by scanning the input speech, without any specific segmentation technique. The language processing part is a predictive LR parser, which is guided by an LR parsing table generated automatically from context-free grammar rules and proceeds left to right without backtracking. Time alignment between the predicted phonemes and the sequence of TDNN phoneme outputs is carried out by DTW matching. We call this 'hybrid' integrated recognition system the 'TDNN-LR' method. We report that large-vocabulary isolated word and continuous speech recognition using the TDNN-LR method provided excellent speaker-dependent recognition performance, and that incremental training with a small number of training tokens was very effective for adapting to speaking rate. Furthermore, we report some new achievements as extensions of the TDNN-LR method: (1) two proposed NN architectures provide phoneme recognition performance that is robust to variations in speaking manner, (2) a speaker-adaptation technique can be realized using a NN mapping function between the input speaker and a standard speaker, and (3) new architectures proposed for speaker-independent recognition provide performance that nearly matches speaker-dependent recognition performance.
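
The time-alignment step mentioned in the abstract can be illustrated with a minimal dynamic time warping (DTW) sketch. This is not the authors' implementation: the function name `dtw_align`, the score matrix layout (frames by phonemes), the sum-of-activations objective, and the rule that every predicted phoneme covers at least one frame are all assumptions made for illustration only.

```python
import numpy as np


def dtw_align(frame_scores, predicted):
    """Align a predicted phoneme sequence to per-frame phoneme scores.

    frame_scores: array-like of shape (T, P); a higher value means phoneme p
        is more strongly activated at frame t (hypothetical TDNN outputs).
    predicted: list of N phoneme indices, with 1 <= N <= T.
    Returns (total_score, path), where path[t] is the position in
    `predicted` assigned to frame t.
    """
    frame_scores = np.asarray(frame_scores, dtype=float)
    num_frames = frame_scores.shape[0]
    num_phones = len(predicted)
    if num_phones == 0 or num_phones > num_frames:
        raise ValueError("need 1 <= len(predicted) <= number of frames")

    # acc[t, n]: best accumulated score with frame t assigned to predicted[n].
    acc = np.full((num_frames, num_phones), -np.inf)
    # back[t, n]: 1 if the best path advanced from n - 1, 0 if it stayed on n.
    back = np.zeros((num_frames, num_phones), dtype=int)
    acc[0, 0] = frame_scores[0, predicted[0]]

    for t in range(1, num_frames):
        for n in range(min(t + 1, num_phones)):
            stay = acc[t - 1, n]
            advance = acc[t - 1, n - 1] if n > 0 else -np.inf
            if advance > stay:
                best, back[t, n] = advance, 1
            else:
                best, back[t, n] = stay, 0
            acc[t, n] = best + frame_scores[t, predicted[n]]

    # Backtrack from the last frame, which must end on the last phoneme.
    path = [0] * num_frames
    n = num_phones - 1
    for t in range(num_frames - 1, -1, -1):
        path[t] = n
        if t > 0 and back[t, n] == 1:
            n -= 1
    return float(acc[-1, -1]), path


# Toy example: 4 frames, 3 phoneme classes, predicted sequence 0 -> 1 -> 2.
scores = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.1, 0.7, 0.2],
    [0.0, 0.2, 0.8],
]
total, path = dtw_align(scores, [0, 1, 2])
```

In the system the abstract describes, the predicted sequence would come from the LR parser and the scores from the TDNN; here both are toy values, and competing parser hypotheses would be ranked by comparing their alignment scores.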


Pages: 1834-1844

Page count: 11
