JOINT ALIGNMENT LEARNING-ATTENTION BASED MODEL FOR GRAPHEME-TO-PHONEME CONVERSION

Cited by: 2
Authors
Wang, Yonghe
Bao, Feilong [1]
Zhang, Hui
Gao, Guanglai
Affiliations
[1] Inner Mongolia Univ, Coll Comp Sci, Hohhot, Peoples R China
Keywords
grapheme-to-phoneme conversion; alignment learning; attention; multitask learning; sequence models
DOI
10.1109/ICASSP39728.2021.9413679
Chinese Library Classification (CLC)
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Sequence-to-sequence attention-based models for grapheme-to-phoneme (G2P) conversion have attracted significant interest. The attention-based encoder-decoder framework learns the mapping from input to output tokens by selectively focusing on relevant information, and has shown strong performance. However, the attention mechanism can produce non-monotonic alignments, which degrade G2P conversion accuracy. In this paper, we present a novel approach that optimizes the G2P conversion model by directly aligning the grapheme and phoneme sequences, using alignment learning (AL) as the loss function. In addition, we propose a multi-task learning method that combines a joint alignment-learning model and an attention model to predict proper alignments and thereby improve the accuracy of G2P conversion. Evaluations on Mongolian and CMUDict tasks show that alignment learning as the loss function can effectively train a G2P conversion model. Furthermore, our multi-task method significantly outperforms both the alignment-learning-based model and the attention-based model.
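The record does not include the paper's code, and it does not specify the exact form of the alignment-learning objective. As a rough illustration only, the kind of loss the abstract describes — summing over all monotonic grapheme-to-phoneme alignments rather than relying on attention — can be sketched as a CTC-style forward algorithm. The function name `ctc_alignment_loss` and the CTC interpretation are assumptions, not the authors' published method:

```python
import math

def ctc_alignment_loss(log_probs, target, blank=0):
    """Negative log-likelihood of `target` under a CTC-style objective,
    marginalizing over all monotonic alignments (illustrative sketch).

    log_probs: T x V nested list of per-step log-probabilities over
               phoneme ids (one row per encoder step).
    target:    non-empty list of phoneme ids, without blanks.
    """
    T = len(log_probs)
    # Extended label sequence with blanks around and between phonemes.
    ext = [blank]
    for p in target:
        ext += [p, blank]
    S = len(ext)
    NEG_INF = float("-inf")

    def logsumexp(*xs):
        m = max(xs)
        if m == NEG_INF:
            return NEG_INF
        return m + math.log(sum(math.exp(x - m) for x in xs))

    # alpha[s]: log-prob of all alignment prefixes ending at ext[s].
    alpha = [NEG_INF] * S
    alpha[0] = log_probs[0][ext[0]]
    alpha[1] = log_probs[0][ext[1]]
    for t in range(1, T):
        new = [NEG_INF] * S
        for s in range(S):
            a = alpha[s]                      # stay on the same symbol
            if s > 0:
                a = logsumexp(a, alpha[s - 1])  # advance by one
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                a = logsumexp(a, alpha[s - 2])  # skip the blank
            new[s] = a + log_probs[t][ext[s]]
        alpha = new
    return -logsumexp(alpha[S - 1], alpha[S - 2])

# Multi-task combination as suggested by the abstract; `lam` is an assumed
# interpolation weight, not stated in this record:
#   total_loss = lam * alignment_loss + (1 - lam) * attention_loss
```

For example, with two uniform steps over a vocabulary {blank, phoneme 1} and target [1], the three valid alignments (1,1), (blank,1), (1,blank) give a total probability of 0.75, so the loss is -log 0.75.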
Pages: 7788-7792
Page count: 5