g2pM: A Neural Grapheme-to-Phoneme Conversion Package for Mandarin Chinese Based on a New Open Benchmark Dataset

Cited by: 11
Authors
Park, Kyubyong [1 ]
Lee, Seanie [2 ]
Affiliations
[1] Kakao Brain, Seongnam, South Korea
[2] Korea Adv Inst Sci & Technol, Daejeon, South Korea
Source
Interspeech 2020
Keywords
Grapheme-to-phoneme conversion; Chinese polyphone disambiguation; text-to-speech; Python package
DOI
10.21437/Interspeech.2020-1094
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
Conversion of Chinese graphemes to phonemes (G2P) is an essential component of Mandarin Chinese text-to-speech (TTS) systems. One of the biggest challenges in Chinese G2P conversion is disambiguating the pronunciation of polyphones, that is, characters with multiple pronunciations. Although many academic efforts have addressed this problem, to date there has been no open dataset that can serve as a standard benchmark for fair comparison. In addition, most reported systems are hard to use for researchers or practitioners who simply want to convert Chinese text into pinyin. Motivated by these gaps, we introduce a new benchmark dataset of 99,000+ sentences for Chinese polyphone disambiguation. We train a simple Bi-LSTM model on it and find that it outperforms other preexisting G2P systems and slightly underperforms a pre-trained Chinese BERT. Finally, we package our project and share it on PyPI.
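The polyphone problem described in the abstract can be illustrated with a minimal sketch. This is not the g2pM model or its API: the tiny lexicon, the function names `polyphones` and `baseline_g2p`, and the "pick the first reading" rule are all assumptions made for illustration. The sketch shows why a context-free lookup is not enough and a context-aware model (such as the Bi-LSTM the authors train) is needed.

```python
# Toy lexicon: each character maps to candidate pinyin readings (tone as a digit).
# Hypothetical and incomplete; not taken from the g2pM dataset.
CANDIDATES = {
    "快": ["kuai4"],
    "音": ["yin1"],
    "银": ["yin2"],
    "乐": ["le4", "yue4"],    # polyphone: "le4" in 快乐, "yue4" in 音乐
    "行": ["xing2", "hang2"],  # polyphone: "hang2" in 银行
}


def polyphones(sentence):
    """Return (index, character, candidates) for every character
    that has more than one candidate reading in the toy lexicon."""
    return [
        (i, ch, CANDIDATES[ch])
        for i, ch in enumerate(sentence)
        if len(CANDIDATES.get(ch, [])) > 1
    ]


def baseline_g2p(sentence):
    """Context-free baseline: always pick the first listed reading.
    Characters missing from the lexicon are passed through unchanged."""
    return [CANDIDATES.get(ch, [ch])[0] for ch in sentence]


if __name__ == "__main__":
    for word in ("快乐", "音乐", "银行"):
        print(word, baseline_g2p(word), polyphones(word))
```

The baseline happens to be right for 快乐 but picks "le4" for 音乐 and "xing2" for 银行, where the correct readings are "yue4" and "hang2". Choosing among the candidates based on the surrounding characters is the disambiguation task the benchmark targets.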
Pages: 1723-1727
Page count: 5