A GRAMMAR COMPILER FOR CONNECTED SPEECH RECOGNITION

Cited by: 4

Authors:

BROWN, MK

WILPON, JG

Affiliation:

[1] AT&T Bell Laboratories, Murray Hill, NJ 07974

Source:

IEEE TRANSACTIONS ON SIGNAL PROCESSING | 1991 / Vol. 39 / No. 01

DOI:

10.1109/78.80761

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

It is well known that syntactic constraints, when applied to speech recognition, greatly improve accuracy. However, until recently, constructing an efficient grammar specification for use by a connected word speech recognizer was performed by hand and has been a tedious, time-consuming task prone to error. For this reason, very large grammars have not appeared. We describe a compiler for constructing optimized syntactic digraphs from easily written grammar specifications. These are written in a language called grammar specification language (GSL). The compiler has a preprocessing (macroexpansion) phase, a parse phase, graph code generation and compilation phases, and three optimization phases. Digraphs can also be linked together by a graph linker to form larger digraphs. Language complexity is analyzed in a statistics phase. Heretofore, computer-generated digraphs were often filled with redundancies. Larger graphs were constructed and optimized by hand in order to achieve the required efficiency. We demonstrate that the optimization phase yields graphs with even greater efficiency than previously achieved by hand. We also discuss some preliminary speech recognition results of applying these techniques to intermediate and large graphs. With the introduction of these tools it is now possible to provide a speech recognition user with the ability to define new task grammars in the field. GSL has been used by several untutored users with good success. Experience with GSL indicates that it is a viable medium for quickly and accurately defining grammars for use in connected speech recognition systems.

Pages: 17-28

Page count: 12
