Efficient two-stage processing for joint sequence model-based Thai grapheme-to-phoneme conversion

Cited by: 8
Authors
Rugchatjaroen, Anocha [1]
Saychum, Sittipong [1]
Kongyoung, Sarawoot [1]
Chootrakool, Patcharika [1]
Kasuriya, Sawit [1]
Wutiwiwatchai, Chai [1]
Affiliation
[1] National Science and Technology Development Agency, National Electronics and Computer Technology Center, Khlong Luang, Thailand
Keywords
Grapheme-to-phoneme conversion; Thai; Joint sequence modelling; Thai pseudo-syllable
DOI
10.1016/j.specom.2018.12.003
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Thai grapheme-to-phoneme (G2P) conversion is a challenging task, as many previous studies have documented. This paper introduces a novel two-stage approach to Thai G2P. The first stage uses a Conditional Random Field (CRF) to segment input text into pseudo-syllable (PS) units, the smallest units whose pronunciation is unambiguous, and simultaneously predicts the function of each character. The outputs of the first stage are used to align graphemes and phonemes efficiently, forming graphone joint sequences as input to the next stage. The second stage uses another CRF to model the graphone joint sequences. The character function predicted by the first stage serves as a cue for explicitly resolving some critical Thai G2P difficulties, such as hidden syllables, which often appear in loan words, and complicated character ordering. The evaluation uses a large pronunciation dictionary that covers over 70% of Thai word usage. Experimental results show word error rates (WER) of 6.55% and 8.43% at the first and second prediction stages, respectively, while the overall G2P system achieves a 9.94% WER. This is as much as a 14.49% absolute improvement over a baseline model that uses Context Free Grammar (CFG) syllabification and syllable n-gram modeling.
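The two ideas the abstract relies on, pairing segmented grapheme units with phonemes to form graphones and scoring a system by word error rate, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a simple one-to-one alignment between PS units and phoneme strings, and it assumes WER is the fraction of words whose predicted phoneme sequence differs from the reference in any position. The function names `make_graphones` and `word_error_rate` and the placeholder tokens are hypothetical.

```python
from typing import List, Tuple

# A graphone pairs one grapheme unit with one phoneme string.
Graphone = Tuple[str, str]


def make_graphones(ps_units: List[str], phoneme_units: List[str]) -> List[Graphone]:
    """Pair each pseudo-syllable (PS) unit with its phoneme string.

    Assumption: the alignment is one-to-one, so both lists have equal length.
    """
    if len(ps_units) != len(phoneme_units):
        raise ValueError("expected exactly one phoneme string per PS unit")
    return list(zip(ps_units, phoneme_units))


def word_error_rate(predicted: List[List[str]], reference: List[List[str]]) -> float:
    """Return the fraction of words whose predicted pronunciation is not
    identical to the reference pronunciation.

    Assumption: a word counts as one error if any phoneme differs.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must cover the same words")
    if not reference:
        raise ValueError("cannot compute WER on an empty word list")
    errors = sum(1 for p, r in zip(predicted, reference) if p != r)
    return errors / len(reference)


if __name__ == "__main__":
    # Placeholder tokens only; these are not real Thai graphemes or phonemes.
    print(make_graphones(["ps1", "ps2"], ["ph1", "ph2"]))
    print(word_error_rate([["ph1"], ["ph2"]], [["ph1"], ["ph3"]]))
```

In a pipeline like the one described, the graphone sequence would serve as the label sequence for the second-stage model, and `word_error_rate` would be applied to the final pronunciations of a held-out word list.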
Pages: 105-111
Page count: 7