Efficient two-stage processing for joint sequence model-based Thai grapheme-to-phoneme conversion

Cited by: 8
Authors
Rugchatjaroen, Anocha [1]
Saychum, Sittipong [1]
Kongyoung, Sarawoot [1]
Chootrakool, Patcharika [1]
Kasuriya, Sawit [1]
Wutiwiwatchai, Chai [1]
Affiliations
[1] National Science and Technology Development Agency, National Electronics and Computer Technology Center, Khlong Luang, Thailand
Keywords
Grapheme-to-phoneme conversion; Thai; Joint sequence modelling; Thai pseudo-syllable
DOI
10.1016/j.specom.2018.12.003
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Thai grapheme-to-phoneme (G2P) conversion is a challenging task, and many previous studies have documented its difficulties. This paper introduces a novel two-stage approach to Thai G2P. The first stage uses a Conditional Random Field (CRF) to segment input text into pseudo-syllable (PS) units, the smallest units whose pronunciation is unambiguous, and simultaneously predicts the function of each character. The outputs of the first stage are used to align graphemes and phonemes efficiently, forming graphone joint sequences that serve as input to the next stage. The second stage uses another CRF to model the graphone joint sequences. The character functions predicted by the first stage provide the cue for explicitly resolving some critical Thai G2P difficulties, such as hidden syllables, which often appear in loan words, and complicated character ordering. The evaluation uses a large pronunciation dictionary that covers over 70% of Thai word usage. Experimental results show word error rates (WER) of 6.55% and 8.43% at the first and second prediction stages, respectively, while the overall G2P system achieves a 9.94% WER. This is an absolute improvement of as much as 14.49% over a baseline model that uses Context Free Grammar (CFG) syllabification and syllable n-gram modeling.
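To make the reported metric concrete, the following is a minimal sketch of word-level WER as it is commonly computed for dictionary-based G2P: the percentage of words whose predicted phoneme sequence does not exactly match the reference. This definition is an assumption, since the abstract does not spell it out. The function name `word_error_rate` and the toy phoneme lists are hypothetical and are not taken from the paper. The last lines only do the arithmetic implied by the abstract, assuming the 14.49% absolute improvement is measured against the 9.94% overall WER.

```python
def word_error_rate(references, hypotheses):
    """Return the percentage of words whose predicted phoneme sequence
    differs from the reference (exact-match, word-level WER).

    Assumption: one reference pronunciation per word; the paper's exact
    scoring protocol is not given in the abstract.
    """
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    if not references:
        raise ValueError("at least one word is required")
    errors = sum(1 for ref, hyp in zip(references, hypotheses) if ref != hyp)
    return 100.0 * errors / len(references)


# Hypothetical toy data: four words, each a list of phoneme symbols.
refs = [["k", "a"], ["m", "aa"], ["n", "a", "m"], ["p", "l", "aa"]]
hyps = [["k", "a"], ["m", "a"], ["n", "a", "m"], ["p", "l", "aa"]]
toy_wer = word_error_rate(refs, hyps)  # one of the four words is wrong

# Arithmetic implied by the abstract (assumption: the improvement is
# relative to the overall WER, and "as much as" is taken at its maximum).
overall_wer = 9.94
absolute_improvement = 14.49
implied_baseline_wer = round(overall_wer + absolute_improvement, 2)
```

Under these assumptions, the implied baseline WER would be at most about 24.43%; the abstract itself does not state the baseline figure.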
Pages: 105-111
Page count: 7