IMPROVING GRAPHEME-TO-PHONEME CONVERSION BY INVESTIGATING COPYING MECHANISM IN RECURRENT ARCHITECTURES

Cited by: 0
Authors
Niranjan, Abhishek [1]
Shaik, M. Ali Basha [1]
Affiliation
[1] Samsung Res & Dev Inst, Voice Intelligence, Bangalore, Karnataka, India
Keywords
Grapheme-to-Phoneme; copy augmentation; encoder-decoder; attention
DOI
10.1109/asru46091.2019.9003729
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Attention-driven encoder-decoder architectures have become highly successful in various sequence-to-sequence learning tasks. We propose a copy-augmented bidirectional Long Short-Term Memory (LSTM) encoder-decoder architecture for Grapheme-to-Phoneme conversion. In the Grapheme-to-Phoneme task, many character units in words closely resemble one or more phoneme units, and we attempt to capture this characteristic with a copy-augmented architecture. The proposed model automatically learns to generate phoneme sequences during inference by copying source token embeddings to the decoder's output in a controlled manner. To our knowledge, this is the first investigation of copy augmentation for the Grapheme-to-Phoneme conversion task. We validate our experiments on the accented and non-accented publicly available CMU-Dict datasets and achieve state-of-the-art performance in terms of both phoneme and word error rates. Further, we verify the applicability of the proposed approach on a Hindi lexicon and show that our model outperforms all recent state-of-the-art results.
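The abstract does not spell out how the copy step is gated, so the following is only a minimal sketch of one common way to combine copying with generation: a scalar gate mixes a copy distribution, built from attention weights over the source graphemes, with the decoder's ordinary generation distribution. This is a pointer-generator-style mixture over output probabilities, not the embedding-copying formulation the authors describe. The function name `copy_augmented_distribution`, the `copy_map` from graphemes to phonemes, and all numeric values are hypothetical and chosen purely for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def copy_augmented_distribution(gen_logits, attn_scores, source,
                                phonemes, copy_map, gate_logit):
    """Mix a generation distribution with a copy distribution.

    Hypothetical illustration only: `copy_map` assigns some graphemes a
    phoneme that they closely resemble; attention mass on those graphemes
    is moved onto the mapped phoneme. Attention mass on graphemes without
    a mapping falls back to the generation distribution.
    """
    gen = softmax(gen_logits)                      # generate from vocabulary
    attn = softmax(attn_scores)                    # attention over graphemes
    gate = 1.0 / (1.0 + math.exp(-gate_logit))     # copy probability in (0, 1)

    copy = [0.0] * len(phonemes)
    unmapped = 0.0
    for weight, grapheme in zip(attn, source):
        target = copy_map.get(grapheme)
        if target is None:
            unmapped += weight
        else:
            copy[phonemes.index(target)] += weight

    # Convex combination; the result still sums to 1.
    return [gate * c + (1.0 - gate + gate * unmapped) * g
            for c, g in zip(copy, gen)]


# Toy example (assumed values): the word "bat", attention focused on "b".
phonemes = ["B", "AE", "T", "K"]
source = ["b", "a", "t"]
copy_map = {"b": "B", "t": "T"}                    # "a" has no copy target
dist = copy_augmented_distribution(
    gen_logits=[0.0, 0.0, 0.0, 0.0],
    attn_scores=[2.0, 0.0, 0.0],
    source=source, phonemes=phonemes,
    copy_map=copy_map, gate_logit=0.0)
best = phonemes[dist.index(max(dist))]

# With the gate almost closed, the model reduces to plain generation.
no_copy = copy_augmented_distribution(
    [0.0, 0.0, 0.0, 0.0], [2.0, 0.0, 0.0],
    source, phonemes, copy_map, gate_logit=-50.0)
```

In this toy setting the uniform generator has no preference, so the copy path alone pushes the prediction toward the phoneme aligned with the attended grapheme. In a trained model the gate and attention scores would be produced by the decoder at each step rather than supplied by hand.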
Pages: 442-448
Page count: 7