Efficient Thai Grapheme-to-Phoneme Conversion Using CRF-Based Joint Sequence Modeling

Cited by: 3
Authors
Saychum, Sittipong [1]
Kongyoung, Sarawoot [1]
Rugchatjaroen, Anocha [1]
Chootrakool, Patcharika [1]
Kasuriya, Sawit [1]
Wutiwiwatchai, Chai [1]
Affiliations
[1] National Science and Technology Development Agency, National Electronics and Computer Technology Center, Pathum Thani, Thailand
Keywords
grapheme-to-phoneme conversion; joint sequence modeling; Thai G2P; Conditional Random Fields
DOI
10.21437/Interspeech.2016-621
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents the successful results of applying joint sequence modeling in Thai grapheme-to-phoneme conversion. The proposed method utilizes Conditional Random Fields (CRFs) in two-stage prediction. The first CRF is used for textual syllable segmentation and syllable type prediction. Graphemes and their corresponding phonemes are then aligned using well-designed many-to-many alignment rules and outputs given by the first CRF. The second CRF, modeling the jointly aligned sequences, efficiently predicts phonemes. The proposed method obviously improves the prediction of linking syllables, normally hidden from their textual graphemes. Evaluation results show that the prediction word error rate (WER) of the proposed method reaches 13.66%, which is 11.09% lower than that of the baseline system.
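The abstract reports its result as a word error rate but this record does not define the metric. The sketch below assumes the convention common in G2P evaluation, where a word counts as an error if its predicted phoneme sequence differs from the reference in any position. The function name `word_error_rate` and the romanized toy data are hypothetical and are not taken from the paper.

```python
def word_error_rate(references, predictions):
    """Return the fraction of words whose predicted phonemes differ from the reference.

    Assumption: a word is wrong if any phoneme differs (whole-sequence match).
    """
    if len(references) != len(predictions):
        raise ValueError("references and predictions must have the same length")
    if not references:
        raise ValueError("at least one word is required")
    errors = sum(
        1 for ref, hyp in zip(references, predictions) if list(ref) != list(hyp)
    )
    return errors / len(references)


# Hypothetical toy data: one phoneme list per word (not from the paper).
refs = [["k", "a"], ["m", "aa"], ["s", "a", "t"], ["n", "i"]]
hyps = [["k", "a"], ["m", "a"], ["s", "a", "t"], ["n", "i"]]

wer = word_error_rate(refs, hyps)
print(f"WER = {wer:.2%}")  # one of four words is wrong
```

Under this definition, a single wrong vowel length makes the whole word count as an error, which is why word-level rates run higher than phoneme-level rates on the same output.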
Pages: 1462-1466
Number of pages: 5
Related papers
47 in total
  • [21] Polyphone Disambiguation Based on Maximum Entropy Model in Mandarin Grapheme-to-Phoneme Conversion
    Liu, Fangzhou
    Zhou, You
    [J]. MATERIALS ENGINEERING FOR ADVANCED TECHNOLOGIES, PTS 1 AND 2, 2011, 480-481 : 1043 - +
  • [22] Novel Two-Stage Model for Grapheme-to-Phoneme Conversion using New Grapheme Generation Rules
    Kheang, Seng
    Katsurada, Kouichi
    Iribe, Yurie
    Nitta, Tsuneo
    [J]. 2014 INTERNATIONAL CONFERENCE OF ADVANCED INFORMATICS: CONCEPT, THEORY AND APPLICATION (ICAICTA), 2014, : 97 - 102
  • [23] Grapheme-to-phoneme conversion based on a fast TBL algorithm in mandarin TTS systems
    Zheng, M
    Shi, Q
    Zhang, W
    Cai, LH
    [J]. FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, PT 2, PROCEEDINGS, 2005, 3614 : 600 - 609
  • [24] LOW-RESOURCE GRAPHEME-TO-PHONEME CONVERSION USING RECURRENT NEURAL NETWORKS
    Jyothi, Preethi
    Hasegawa-Johnson, Mark
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5030 - 5034
  • [25] Solving the Phoneme Conflict in Grapheme-to-Phoneme Conversion Using a Two-Stage Neural Network-Based Approach
    Kheang, Seng
    Katsurada, Kouichi
    Iribe, Yurie
    Nitta, Tsuneo
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2014, E97D (04): : 901 - 910
  • [26] Phonetisaurus: Exploring grapheme-to-phoneme conversion with joint n-gram models in the WFST framework
    Novak, Josef Robert
    Minematsu, Nobuaki
    Hirose, Keikichi
    [J]. NATURAL LANGUAGE ENGINEERING, 2016, 22 (06) : 907 - 938
  • [28] ACOUSTIC DATA-DRIVEN GRAPHEME-TO-PHONEME CONVERSION USING KL-HMM
    Rasipuram, Ramya
    Magimai-Doss, Mathew
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4841 - 4844
  • [30] DNN-based grapheme-to-phoneme conversion for Arabic text-to-speech synthesis
    Ali, Ikbel Hadj
    Mnasri, Zied
    Lachiri, Zied
    [J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2020, 23 (03) : 569 - 584