Efficient Thai Grapheme-to-Phoneme Conversion Using CRF-Based Joint Sequence Modeling

Cited by: 3
Authors
Saychum, Sittipong [1]
Kongyoung, Sarawoot [1]
Rugchatjaroen, Anocha [1]
Chootrakool, Patcharika [1]
Kasuriya, Sawit [1]
Wutiwiwatchai, Chai [1]
Affiliations
[1] Natl Sci & Technol Dev Agcy, Natl Elect & Comp Technol Ctr, Pathum Thani, Thailand
Keywords
grapheme-to-phoneme conversion; joint sequence modeling; Thai G2P; Conditional Random Fields
DOI
10.21437/Interspeech.2016-621
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents the successful results of applying joint sequence modeling to Thai grapheme-to-phoneme conversion. The proposed method uses Conditional Random Fields (CRFs) in a two-stage prediction process. The first CRF performs textual syllable segmentation and syllable type prediction. Graphemes and their corresponding phonemes are then aligned using well-designed many-to-many alignment rules together with the outputs of the first CRF. The second CRF, which models the jointly aligned sequences, then predicts the phonemes efficiently. The proposed method clearly improves the prediction of linking syllables, which are normally hidden from their textual graphemes. Evaluation results show that the word error rate (WER) of the proposed method reaches 13.66%, which is 11.09% lower than that of the baseline system.
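The abstract reports a word error rate but does not define it. A minimal sketch, assuming the convention common in G2P evaluation (a word counts as an error if its predicted phoneme sequence differs from the reference in any position), might look like the following. The function name and the toy phoneme symbols are hypothetical and are not taken from the paper or its data.

```python
from typing import Sequence


def word_error_rate(
    references: Sequence[Sequence[str]],
    predictions: Sequence[Sequence[str]],
) -> float:
    """Return the fraction of words whose predicted phoneme sequence
    is not an exact match for the reference sequence.

    Assumption: exact-match scoring per word, which is a common G2P
    convention; the paper's abstract does not state its definition.
    """
    if len(references) != len(predictions):
        raise ValueError("references and predictions must have equal length")
    if not references:
        raise ValueError("at least one word is required")

    # A word is wrong if any phoneme differs or the lengths differ.
    errors = sum(
        1 for ref, hyp in zip(references, predictions) if list(ref) != list(hyp)
    )
    return errors / len(references)


# Toy example with placeholder symbols (not real Thai phonemes).
refs = [["k", "a"], ["m", "i", "n"], ["t", "o"], ["s", "u"]]
hyps = [["k", "a"], ["m", "i", "n"], ["t", "a"], ["s", "u"]]
wer = word_error_rate(refs, hyps)
print(f"WER = {wer:.2%}")
```

Under this definition a single wrong phoneme makes the whole word count as an error, so WER is always at least as high as the per-phoneme error rate on the same data.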
Pages: 1462-1466
Number of pages: 5