Deep Bidirectional Long Short-Term Memory Recurrent Neural Networks for Grapheme-to-Phoneme Conversion utilizing Complex Many-to-Many Alignments

Cited by: 10
Authors
Mousa, Amr El-Desoky [1]
Schuller, Bjoern [1, 2]
Affiliations
[1] Univ Passau, Chair Complex & Intelligent Syst, Passau, Germany
[2] Imperial Coll London, Dept Comp, London, England
Funding
European Union Horizon 2020
Keywords
grapheme-to-phoneme conversion; long short-term memory; many-to-many alignments;
DOI
10.21437/Interspeech.2016-1229
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Efficient grapheme-to-phoneme (G2P) conversion models are considered indispensable components for achieving state-of-the-art performance in modern automatic speech recognition (ASR) and text-to-speech (TTS) systems. The role of these models is to provide such systems with a means to generate accurate pronunciations for unseen words. Recent work in this domain is based on recurrent neural networks (RNNs) that are capable of translating grapheme sequences into phoneme sequences while taking into account the full context of graphemes. To achieve high performance with these models, utilizing explicit alignment information is found to be essential. The quality of the G2P model heavily depends on the imposed alignment constraints. In this paper, a novel approach is proposed using complex many-to-many G2P alignments to improve the performance of G2P models based on deep bidirectional long short-term memory (BLSTM) RNNs. Extensive experiments cover models with different numbers of hidden layers, projection layers, input splicing windows, and varying alignment schemes. One observes that complex alignments significantly improve the performance on the publicly available CMUDict US English dataset. We compare our results with previously published results.
Pages: 2836-2840
Number of pages: 5