T5G2P: Text-to-Text Transfer Transformer Based Grapheme-to-Phoneme Conversion

Authors
Rezackova, Marketa [1, 2]
Tihelka, Daniel [2]
Matousek, Jindrich [1, 2]
Affiliations
[1] Univ West Bohemia, Fac Appl Sci, Dept Cybernet, Plzen 30100, Czech Republic
[2] Univ West Bohemia, Fac Appl Sci, New Technol Informat Soc, Plzen 30100, Czech Republic
Keywords
Phonetics; Dictionaries; Stress; Accuracy; Transformers; Training; Task analysis; CNN; Czech; English; G2P; German; phonetic transcription; RNN; Russian; T5;
DOI
10.1109/TASLP.2024.3426332
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The present paper explores the use of several deep neural network architectures to carry out a grapheme-to-phoneme (G2P) conversion, aiming to find a universal and language-independent approach to the task. The models explored are trained on whole sentences in order to automatically capture cross-word context (such as voicedness assimilation) if it exists in the given language. Four different languages, English, Czech, Russian, and German, were chosen due to their different nature and requirements for the G2P task. Ultimately, the Text-to-Text Transfer Transformer (T5) based model achieved very high conversion accuracy on all the tested languages. Also, it exceeded the accuracy reached by a similar system, when trained on a public LibriSpeech database.
Pages: 3466-3476
Number of pages: 11