Automatic derivation of a phoneme set with tone information for Chinese speech recognition based on mutual information criterion

Cited by: 0
Authors
Zhang, Jin-Song [1]
Hu, Xin-Hui [1]
Nakamura, Satoshi [1]
Affiliation
[1] ATR, Spoken Language Commun Res Labs, Kyoto 6190288, Japan
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
An appropriate approach to modeling tone information is helpful for building a Chinese large-vocabulary continuous speech recognition system. We propose deriving an efficient phoneme set of tone-dependent sub-word units for such a system by iteratively merging pairs of tone-dependent units according to the principle of minimal loss of mutual information (MI). The MI is measured between the word tokens and their phoneme transcriptions in a training text corpus, based on the system's lexicon and language model. The approach can keep the discriminative tonal (and phoneme) contrasts that are most helpful for disambiguating words that would become homophones without tone information, and merge those tonal (and phoneme) contrasts that are not important for word disambiguation in the recognition task. This enables a flexible selection of the phoneme set according to a balance between the amount of MI retained and the number of phonemes. We applied the method to the traditional phoneme set of Initials/Finals and derived several phoneme sets with different numbers of units. Speech recognition experiments using the derived sets showed their effectiveness.
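The merging procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that are not in the record: a four-word toy lexicon, unigram word probabilities in place of a full language model, one pronunciation per word (so the MI between a word and its transcription reduces to the entropy of the transcriptions), and merging restricted to tonal variants of the same base unit. The names `base_of`, `mutual_information`, `greedy_merge`, `LEXICON` and `WORD_PROBS` are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def base_of(unit):
    """Strip a trailing tone digit: 'a3' -> 'a'. Toneless units are unchanged."""
    return unit.rstrip("0123456789")


def mutual_information(word_probs, lexicon, unit_map):
    """I(W; Q) in bits, where Q is the word's transcription under unit_map.

    Assumption: each word has exactly one pronunciation, so Q is a function
    of W and I(W; Q) equals the entropy H(Q).
    """
    trans_probs = defaultdict(float)
    for word, p in word_probs.items():
        key = tuple(unit_map[u] for u in lexicon[word])
        trans_probs[key] += p
    return -sum(p * math.log2(p) for p in trans_probs.values() if p > 0)


def greedy_merge(word_probs, lexicon, n_merges):
    """Repeatedly merge the pair of unit classes whose merge loses the least MI."""
    units = sorted({u for pron in lexicon.values() for u in pron})
    unit_map = {u: u for u in units}  # unit -> label of its current class
    history = []
    for _ in range(n_merges):
        base_mi = mutual_information(word_probs, lexicon, unit_map)
        classes = sorted(set(unit_map.values()))
        best = None
        for a, b in combinations(classes, 2):
            # Assumption: only tonal variants of the same base unit may merge.
            if base_of(a) != base_of(b):
                continue
            trial = {u: (a if c == b else c) for u, c in unit_map.items()}
            loss = base_mi - mutual_information(word_probs, lexicon, trial)
            if best is None or loss < best[0]:
                best = (loss, a, b, trial)
        if best is None:  # no mergeable pair left
            break
        loss, a, b, unit_map = best
        history.append((a, b, loss))
    return unit_map, history


# Toy data (hypothetical): Initial/Final units with the tone marked on the Final.
LEXICON = {
    "mother": ("m", "a1"),
    "horse": ("m", "a3"),
    "eight": ("b", "a1"),
    "father": ("b", "a4"),
}
WORD_PROBS = {"mother": 0.4, "horse": 0.1, "eight": 0.3, "father": 0.2}

unit_map, history = greedy_merge(WORD_PROBS, LEXICON, n_merges=1)
print(history)
```

In this toy setup the first merge joins `a3` and `a4` at zero MI loss, because no two words are distinguished only by that contrast. Merging the result with `a1` would turn "mother"/"horse" and "eight"/"father" into homophones and costs about 0.85 bits, which is the kind of trade-off between retained MI and set size that the abstract describes.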
Pages: 337-340
Number of pages: 4