Efficient Thai Grapheme-to-Phoneme Conversion Using CRF-Based Joint Sequence Modeling

Cited by: 3
Authors
Saychum, Sittipong [1]
Kongyoung, Sarawoot [1]
Rugchatjaroen, Anocha [1]
Chootrakool, Patcharika [1]
Kasuriya, Sawit [1]
Wutiwiwatchai, Chai [1]
Affiliations
[1] National Science and Technology Development Agency, National Electronics and Computer Technology Center, Pathum Thani, Thailand
Keywords
grapheme-to-phoneme conversion; joint sequence modeling; Thai G2P; Conditional Random Fields
DOI
10.21437/Interspeech.2016-621
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents successful results from applying joint sequence modeling to Thai grapheme-to-phoneme (G2P) conversion. The proposed method uses Conditional Random Fields (CRFs) in a two-stage prediction scheme. The first CRF performs textual syllable segmentation and syllable type prediction. Graphemes and their corresponding phonemes are then aligned using carefully designed many-to-many alignment rules together with the outputs of the first CRF. The second CRF, which models the jointly aligned sequences, efficiently predicts the phonemes. The proposed method clearly improves the prediction of linking syllables, which are normally hidden in the textual graphemes. Evaluation results show that the word error rate (WER) of the proposed method reaches 13.66%, which is 11.09% lower than that of the baseline system.
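The abstract reports results as a word error rate. As a minimal sketch, assuming the convention common in G2P evaluation that a word counts as an error when its predicted phoneme sequence differs from the reference in any position, the metric could be computed as below. The function name `word_error_rate` and the toy phoneme symbols are hypothetical placeholders and are not taken from the paper or its data.

```python
from typing import Sequence


def word_error_rate(
    references: Sequence[Sequence[str]],
    predictions: Sequence[Sequence[str]],
) -> float:
    """Fraction of words whose predicted phoneme sequence is not an exact match.

    Assumption: one reference pronunciation per word, exact-match scoring.
    """
    if len(references) != len(predictions):
        raise ValueError("references and predictions must have the same length")
    if not references:
        raise ValueError("at least one word is required")
    # A word is wrong if any phoneme differs or the lengths differ.
    errors = sum(
        1 for ref, hyp in zip(references, predictions) if list(ref) != list(hyp)
    )
    return errors / len(references)


# Toy data with placeholder phoneme symbols (not real Thai transcriptions).
refs = [["k", "a"], ["m", "a", "a"], ["s", "a", "p"], ["n", "a", "m"]]
preds = [["k", "a"], ["m", "a"], ["s", "a", "p"], ["n", "a", "m"]]
wer = word_error_rate(refs, preds)
print(f"WER = {wer:.2%}")
```

In this toy case one of the four words is mispredicted, so the function returns 0.25. Whether the paper scores against single or multiple reference pronunciations is not stated in the abstract.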
Pages: 1462-1466
Page count: 5