Using mutual information criterion to design an efficient phoneme set for Chinese speech recognition

Cited by: 9

Authors:

Zhang, Jin-Song [1,2]

Hu, Xin-Hui [1,2,3]

Nakamura, Satoshi [1,2]

Affiliations:

[1] Natl Inst Informat & Commun Technol, Knowledge Creating Commun Res Ctr, Spoken Language Commun Grp, Kyoto 6190288, Japan

[2] ATR Spoken Language Translat Commun Res Labs, Kyoto, Japan

[3] Res & Dev Ctr Toshiba, Kyoto, Japan

Source:

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS | 2008 / Vol. E91D / No. 03

Keywords:

mutual information; Chinese lexical tones; tone dependent units; speech recognition;

DOI:

10.1093/ietisy/e91-d.3.508

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Chinese is a representative tonal language, and how to process tone information in state-of-the-art large-vocabulary speech recognition systems has been an attractive research topic. This paper presents a novel way to derive an efficient phoneme set of tone-dependent units for building a recognition system, by iteratively merging pairs of tone-dependent units according to the principle of minimal loss of Mutual Information (MI). The MI is measured between the word tokens and their phoneme transcriptions in a training text corpus, based on the system lexicon and language model. The approach keeps the discriminative tonal (and phoneme) contrasts that are most helpful for disambiguating words that would become homophones without tone information, and merges the tonal (and phoneme) contrasts that are not important for word disambiguation in the recognition task. This enables flexible selection of a phoneme set according to a balance between the amount of MI retained and the number of phonemes. We applied the method to the traditional phoneme set of Initials/Finals and derived several phoneme sets with different numbers of units. Speech recognition experiments using the derived sets showed the effectiveness of the method.
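The merging procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several simplifying assumptions: words are treated in isolation with unigram probabilities (the paper uses the system lexicon and language model over a training corpus), each word has exactly one transcription, units carry their tone as a trailing digit, and only units with the same tone-stripped base may be merged. The names `base`, `mutual_information`, `greedy_merge`, and the toy lexicon are all hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def base(unit):
    """Strip a trailing tone digit, e.g. 'a3' -> 'a' (assumed naming scheme)."""
    return unit.rstrip("0123456789")


def mutual_information(word_probs, lexicon, mapping):
    """I(W; P) in bits between words W and their transcriptions P.

    Each word has one transcription here, so H(P|W) = 0 and I(W; P) = H(P).
    """
    trans_probs = defaultdict(float)
    for word, p in word_probs.items():
        transcription = tuple(mapping[u] for u in lexicon[word])
        trans_probs[transcription] += p
    return -sum(p * math.log2(p) for p in trans_probs.values() if p > 0)


def greedy_merge(word_probs, lexicon, target_size):
    """Repeatedly merge the same-base unit pair whose merge loses the least MI."""
    units = sorted({u for pron in lexicon.values() for u in pron})
    mapping = {u: u for u in units}  # original unit -> current merged class
    history = []
    while len(set(mapping.values())) > target_size:
        current = mutual_information(word_probs, lexicon, mapping)
        classes = sorted(set(mapping.values()))
        best = None
        for a, b in combinations(classes, 2):
            if base(a) != base(b):
                continue  # assumption: only tonal variants of one base are merged
            trial = {u: (a if c == b else c) for u, c in mapping.items()}
            loss = current - mutual_information(word_probs, lexicon, trial)
            if best is None or loss < best[0]:
                best = (loss, a, b, trial)
        if best is None:
            break  # no mergeable pair is left
        loss, a, b, mapping = best
        history.append((a, b, loss))
    return mapping, history


# Toy data (hypothetical): Initial/Final transcriptions with tone on the Final.
lexicon = {
    "ma1": ("m", "a1"),
    "ma3": ("m", "a3"),
    "ma4": ("m", "a4"),
    "mi3": ("m", "i3"),
    "mi4": ("m", "i4"),
}
word_probs = {"ma1": 0.4, "ma3": 0.3, "ma4": 0.1, "mi3": 0.15, "mi4": 0.05}

mapping, history = greedy_merge(word_probs, lexicon, target_size=5)
```

In this toy setting the first merge joins `i3` and `i4`, because the two words that become homophones carry little probability mass, so collapsing them costs the least MI. Lowering `target_size` continues the merging and trades retained MI against the number of units, which is the balance the abstract refers to.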


Pages: 508-513

Page count: 6

