Using mutual information criterion to design an efficient phoneme set for Chinese speech recognition

Cited by: 9
Authors
Zhang, Jin-Song [1 ,2 ]
Hu, Xin-Hui [1 ,2 ,3 ]
Nakamura, Satoshi [1 ,2 ]
Affiliations
[1] Natl Inst Informat & Commun Technol, Knowledge Creating Commun Res Ctr, Spoken Language Commun Grp, Kyoto 6190288, Japan
[2] ATR Spoken Language Translat Commun Res Labs, Kyoto, Japan
[3] Res & Dev Ctr Toshiba, Kyoto, Japan
Source
Keywords
mutual information; Chinese lexical tones; tone dependent units; speech recognition;
DOI
10.1093/ietisy/e91-d.3.508
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Chinese is a representative tonal language, and how to process tone information in state-of-the-art large-vocabulary speech recognition systems remains an attractive research topic. This paper presents a novel way to derive an efficient phoneme set of tone-dependent units for building a recognition system: pairs of tone-dependent units are merged iteratively according to the principle of minimal loss of Mutual Information (MI). The MI is measured between the word tokens and their phoneme transcriptions in a training text corpus, based on the system lexicon and language model. The approach keeps the discriminative tonal (and phoneme) contrasts that are most helpful for disambiguating words that become homophones when tones are ignored, and merges those tonal (and phoneme) contrasts that are not important for word disambiguation in the recognition task. This enables a flexible selection of the phoneme set according to a balance between the amount of MI retained and the number of phonemes. We applied the method to the traditional phoneme set of Initials/Finals and derived several phoneme sets with different numbers of units. Speech recognition experiments using the derived sets showed the effectiveness of the method.
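The merging procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several simplifying assumptions: word probabilities come from a unigram model rather than the full language model, each word has exactly one transcription (so I(W;T) reduces to the entropy H(T) of the transcriptions), and only units that share the same base after the tone digit is stripped are merge candidates. The names `mutual_information`, `greedy_merge`, `base_of`, and the toy lexicon are all hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def base_of(unit):
    """Strip a trailing tone digit, e.g. 'a3' -> 'a' (assumed naming scheme)."""
    return unit.rstrip("0123456789")


def mutual_information(word_probs, lexicon, mapping):
    """I(W;T) in bits, assuming one transcription per word.

    Under that assumption H(T|W) = 0, so I(W;T) = H(T), where T is the
    transcription after each unit is replaced by its merged class.
    """
    trans_probs = defaultdict(float)
    for word, p in word_probs.items():
        key = tuple(mapping[u] for u in lexicon[word])
        trans_probs[key] += p
    return -sum(p * math.log2(p) for p in trans_probs.values() if p > 0)


def greedy_merge(word_probs, lexicon, target_size):
    """Repeatedly merge the same-base pair of units whose merge loses least MI."""
    units = sorted({u for seq in lexicon.values() for u in seq})
    mapping = {u: u for u in units}  # unit -> current merged class
    history = []                     # (kept class, absorbed class, MI loss)
    while len(set(mapping.values())) > target_size:
        classes = sorted(set(mapping.values()))
        base_mi = mutual_information(word_probs, lexicon, mapping)
        best = None
        for a, b in combinations(classes, 2):
            if base_of(a) != base_of(b):
                continue  # assumption: only tonal variants may merge
            trial = {u: (a if c == b else c) for u, c in mapping.items()}
            loss = base_mi - mutual_information(word_probs, lexicon, trial)
            if best is None or loss < best[0]:
                best = (loss, a, b, trial)
        if best is None:
            break  # no mergeable pair left
        loss, a, b, mapping = best
        history.append((a, b, loss))
    return mapping, history


# Toy data (hypothetical): four words, unigram probabilities, Initial/Final units.
lexicon = {
    "mother": ("m", "a1"),
    "horse": ("m", "a3"),
    "kick": ("t", "i1"),
    "ground": ("d", "i4"),
}
word_probs = {"mother": 0.3, "horse": 0.3, "kick": 0.2, "ground": 0.2}

if __name__ == "__main__":
    merged, steps = greedy_merge(word_probs, lexicon, target_size=5)
    for kept, absorbed, loss in steps:
        print(f"merge {absorbed} into {kept}: MI loss = {loss:.3f} bits")
```

In this toy lexicon the tonal contrast between `i1` and `i4` is redundant because the initials `t` and `d` already separate the two words, so it is merged first at no cost. The contrast between `a1` and `a3` is the only thing distinguishing "mother" from "horse", so merging it creates a homophone pair and loses MI.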
Pages: 508-513
Number of pages: 6
Related papers
50 in total
  • [1] Automatic derivation of a phoneme set with tone information for Chinese speech recognition based on mutual information criterion
    Zhang, Jin-Song
    Hu, Xin-Hui
    Nakamura, Satoshi
    [J]. 2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 337 - 340
  • [2] Phoneme Set Design for Speech Recognition of English by Japanese
    Wang, Xiaoyun
    Zhang, Jinsong
    Nishida, Masafumi
    Yamamoto, Seiichi
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2015, E98D (01): : 148 - 156
  • [3] Speech Enhancement Using Source Information for Phoneme Recognition of Speech with Background Music
    Khonglah, Banriskhem K.
    Dey, Abhishek
    Prasanna, S. R. Mahadeva
    [J]. CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2019, 38 (02) : 643 - 663
  • [4] Speech Enhancement Using Source Information for Phoneme Recognition of Speech with Background Music
    Banriskhem K. Khonglah
    Abhishek Dey
    S. R. Mahadeva Prasanna
    [J]. Circuits, Systems, and Signal Processing, 2019, 38 : 643 - 663
  • [5] Using Mutual Information Criterion to Design an Effective Lexicon for Chinese Pinyin-to-Character Conversion
    Li, Wei
    Zhang, Jinsong
    Xie, Yanlu
    Wang, Xiaoyun
    Nishida, Masafumi
    Yamamoto, Seiichi
    [J]. 2013 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP 2013), 2013, : 269 - 272
  • [6] The Phoneme Set Influence for Lithuanian Speech Commands Recognition Accuracy
    Greibus, Mindaugas
    Ringeliene, Zivile
    Telksnys, Laimutis
    [J]. 2017 OPEN CONFERENCE OF ELECTRICAL, ELECTRONIC AND INFORMATION SCIENCES (ESTREAM), 2017,
  • [7] Phoneme Set Design Based on Integrated Acoustic and Linguistic Features for Second Language Speech Recognition
    Wang, Xiaoyun
    Kato, Tsuneo
    Yamamoto, Seiichi
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2017, E100D (04): : 857 - 864
  • [8] Phoneme recognition using speech image (spectrogram)
    Ahmadi, M
    Bailey, NJ
    Hoyle, BS
    [J]. ICSP '96 - 1996 3RD INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, PROCEEDINGS, VOLS I AND II, 1996, : 675 - 677
  • [9] Approximated mutual information training for speech recognition using myoelectric signals
    Guo, Hua J.
    Chan, A. D. C.
    [J]. 2006 28TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-15, 2006, : 96 - 99
  • [10] Information divergence criterion in speech signal recognition
    Bocharov, I
    Lukin, P
    [J]. FUNDAMENTA INFORMATICAE, 2005, 68 (04) : 303 - 313