Sequence-to-Sequence Neural Net Models for Grapheme-to-Phoneme Conversion

Authors
Yao, Kaisheng [1]
Zweig, Geoffrey [1]
Affiliations
[1] Microsoft Res, Redmond, WA 98052 USA
Keywords
neural networks; grapheme-to-phoneme conversion; sequence-to-sequence neural networks
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Sequence-to-sequence translation methods based on generation with a side-conditioned language model have recently shown promising results in several tasks. In machine translation, models conditioned on source-side words have been used to produce target-language text, and in image captioning, models conditioned on images have been used to generate caption text. Past work with this approach has focused on large-vocabulary tasks and measured quality in terms of BLEU. In this paper, we explore the applicability of such models to the qualitatively different grapheme-to-phoneme task. Here, the input- and output-side vocabularies are small, plain n-gram models do well, and credit is given only when the output is exactly correct. We find that the simple side-conditioned generation approach is able to rival the state of the art, and we are able to significantly advance the state of the art with bi-directional long short-term memory (LSTM) neural networks that use the same alignment information that is used in conventional approaches.
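The abstract's point that credit is given only for exactly correct output can be illustrated with a minimal sketch of exact-match scoring. This is not taken from the paper: the function name `word_error_rate` and the toy phoneme sequences are assumptions chosen for illustration only.

```python
def word_error_rate(references, hypotheses):
    """Fraction of words whose predicted phoneme sequence differs from the
    reference in any position (hypothetical helper, not from the paper)."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    if not references:
        raise ValueError("at least one word is required")
    # A word counts as an error unless every phoneme matches exactly.
    errors = sum(
        1 for ref, hyp in zip(references, hypotheses) if list(ref) != list(hyp)
    )
    return errors / len(references)


# Toy data (illustrative only): the second word has one wrong phoneme.
refs = [["K", "AE", "T"], ["D", "AO", "G"]]
hyps = [["K", "AE", "T"], ["D", "AA", "G"]]
print(word_error_rate(refs, hyps))  # 0.5
```

Unlike BLEU, which rewards partial n-gram overlap, a score of this kind gives no credit to the second word even though two of its three phonemes are right.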
Pages: 3330-3334
Page count: 5