Comparison of syllable-based and phoneme-based DNN-HMM in Japanese Speech Recognition

Cited by: 0
Authors
Seki, Hiroshi [1]
Yamamoto, Kazumasa [1]
Nakagawa, Seiichi [1]
Affiliations
[1] Toyohashi Univ Technol, Dept Comp Sci & Engn, Toyohashi, Aichi, Japan
Keywords
syllable; phoneme; GMM-HMM; deep neural network; DNN-HMM; speech recognition; DEEP; ADAPTATION;
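DOI
Not available
Chinese Library Classification number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract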
Japanese is a syllabic language, and we have studied syllable-based GMM-HMMs for Japanese speech recognition. In this paper, we investigate the differences in recognition accuracy between phoneme-based and syllable-based GMM-HMMs and DNN (deep neural network)-HMMs. First, we compare syllable-based and phoneme-based DNN-HMMs. Second, we train a tied-state, left-context-dependent syllable DNN-HMM and compare the three modeling methods. In the experiments, the left-context syllable DNN-HMM gave a 5% relative WER gain over a left-context syllable GMM-HMM, and the triphone DNN-HMM gave an 11% relative WER gain over a syllable-based DNN-HMM. Finally, on the ASJ+JNAS corpus, which consists of about 70 hours of speech, modeling left-context phonemes did not work, and the context-independent syllable-based DNN-HMM achieved the best performance.
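The abstract reports its results as relative WER gains. As a minimal sketch of that arithmetic, assuming "relative gain" means the reduction in word error rate divided by the baseline word error rate (the usual convention, not confirmed by this record), the calculation could look like the following. The function name and the WER values are hypothetical and are not taken from the paper.

```python
def relative_wer_gain(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction of a new system over a baseline.

    Both arguments are word error rates in the same unit (e.g. percent).
    The result is a fraction: 0.05 means a 5% relative gain.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical WERs chosen only to illustrate the formula;
    # they are NOT results from the paper.
    gain = relative_wer_gain(baseline_wer=20.0, new_wer=19.0)
    print(f"relative WER gain: {gain:.1%}")
```

Under this convention, a relative gain depends on the baseline: the same absolute improvement of one WER point is a larger relative gain when the baseline error rate is lower.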
Pages: 249-254
Number of pages: 6