T5G2P: Using Text-to-Text Transfer Transformer for Grapheme-to-Phoneme Conversion

Cited by: 8
Authors
Rezackova, Marketa [1]
Svec, Jan
Tihelka, Daniel
Affiliations
[1] Univ West Bohemia, New Technol Informat Soc, Fac Appl Sci, Plzen, Czech Republic
Keywords
grapheme-to-phoneme; phonetic transcription; T5; transformers; TTS system
DOI
10.21437/Interspeech.2021-546
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104; 100213;
Abstract
Despite the increasing popularity of end-to-end text-to-speech (TTS) systems, an accurate grapheme-to-phoneme (G2P) module is still a crucial part of those systems that rely on phonetic input. In this paper, we therefore introduce T5G2P, a Text-to-Text Transfer Transformer (T5) neural network model that converts an input text sentence into a phoneme sequence with high accuracy. We evaluate the trained T5 model on English and Czech, since G2P conversion in these languages has different specific properties, including homograph disambiguation, cross-word assimilation, and irregular pronunciation of loanwords. The paper also analyzes the homograph issue in English and offers another approach to Czech phonetic transcription based on the detection of pronunciation exceptions.
Pages: 6-10
Number of pages: 5
Related papers
20 in total
  • [1] T5G2P: Text-to-Text Transfer Transformer Based Grapheme-to-Phoneme Conversion
    Rezackova, Marketa
    Tihelka, Daniel
    Matousek, Jindrich
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 3466 - 3476
  • [2] Automatic Grapheme-to-Phoneme Conversion of Arabic Text
    Al-Daradkah, Belal
    Al-Diri, Bashir
    [J]. 2015 SCIENCE AND INFORMATION CONFERENCE (SAI), 2015, : 468 - 473
  • [3] DNN-based grapheme-to-phoneme conversion for Arabic text-to-speech synthesis
    Ikbel Hadj Ali
    Zied Mnasri
    Zied Lachiri
    [J]. International Journal of Speech Technology, 2020, 23 : 569 - 584
  • [4] DNN-based grapheme-to-phoneme conversion for Arabic text-to-speech synthesis
    Ali, Ikbel Hadj
    Mnasri, Zied
    Lachiri, Zied
    [J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2020, 23 (03) : 569 - 584
  • [5] A unified approach to grapheme-to-phoneme conversion for the PLATTOS Slovenian text-to-speech system
    Rojc, Matej
    Kacic, Zdravko
    [J]. APPLIED ARTIFICIAL INTELLIGENCE, 2007, 21 (06) : 563 - 603
  • [6] HaT5: Hate Language Identification using Text-to-Text Transfer Transformer
    Sabry, Sana Sabah
    Adewumi, Tosin
    Abid, Nosheen
    Kovacs, Gyorgy
    Liwicki, Foteini
    Liwicki, Marcus
    [J]. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [7] Text-to-Text Transfer Transformer Phrasing Model Using Enriched Text Input
    Rezackova, Marketa
    Matousek, Jindrich
    [J]. TEXT, SPEECH, AND DIALOGUE (TSD 2022), 2022, 13502 : 389 - 400
  • [8] ERROR DETECTION OF GRAPHEME-TO-PHONEME CONVERSION IN TEXT-TO-SPEECH SYNTHESIS USING SPEECH SIGNAL AND LEXICAL CONTEXT
    Vythelingum, Kevin
    Esteve, Yannick
    Rosec, Olivier
    [J]. 2017 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2017, : 692 - 697
  • [9] Applying Linguistic G2P Knowledge on a Statistical Grapheme-to-phoneme Conversion in Khmer
    Sar, Vathnak
    Tan, Tien-Ping
    [J]. FIFTH INFORMATION SYSTEMS INTERNATIONAL CONFERENCE, 2019, 161 : 415 - 423
  • [10] Grapheme-to-Phoneme Conversion based on High-order Markov Chain for Spoken Term Detection by text query
    Prozorov, Dmitriy
    Tatarinova, Alexandra
    [J]. 2017 IEEE EAST-WEST DESIGN & TEST SYMPOSIUM (EWDTS), 2017,