Predicting Pronunciations with Syllabification and Stress with Recurrent Neural Networks

Cited by: 11
Authors
van Esch, Daan [1]
Chua, Mason [1]
Rao, Kanishka [1]
Affiliation
[1] Google Inc, Mountain View, CA 94043 USA
Source
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES | 2016
Keywords
LSTM; pronunciation; syllabification; stress;
DOI
10.21437/Interspeech.2016-1419
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
Word pronunciations, consisting of phoneme sequences and the associated syllabification and stress patterns, are vital for both speech recognition and text-to-speech (TTS) systems. For speech recognition, phoneme sequences for words may be learned from audio data. We train recurrent neural network (RNN)-based models to predict the syllabification and stress pattern for such pronunciations, making them usable for TTS. We find that these RNN models significantly outperform naive rule-based models for almost all languages we tested. Further, we obtain additional improvements in the stress prediction model by using the spelling as features in addition to the phoneme sequence. Finally, we train a single RNN model to predict the phoneme sequence, syllabification, and stress for a given word. For several languages, this single RNN outperforms similar models trained specifically for either phoneme sequence or stress prediction. We report an exhaustive comparison of these approaches for twenty languages.
Pages: 2841 - 2845
Number of pages: 5
Related papers
50 in total
  • [41] Predicting the number of customer transactions using stacked LSTM recurrent neural networks
    Sebt, M. V.
    Ghasemi, S. H.
    Mehrkian, S. S.
    SOCIAL NETWORK ANALYSIS AND MINING, 2021, 11 (01)
  • [42] Predicting Host CPU Utilization in Cloud Computing using Recurrent Neural Networks
    Duggan, Martin
    Mason, Karl
    Duggan, Jim
    Howley, Enda
    Barrett, Enda
    2017 12TH INTERNATIONAL CONFERENCE FOR INTERNET TECHNOLOGY AND SECURED TRANSACTIONS (ICITST), 2017 : 67 - 72
  • [43] PREDICTING EXTUBATION READINESS IN PEDIATRIC INTENSIVE CARE PATIENTS WITH RECURRENT NEURAL NETWORKS
    Lam, Vi
    Pruitt, Laura
    Carlin, Cameron
    Ledbetter, David
    Aczon, Melissa
    Wetzel, Randall
    CRITICAL CARE MEDICINE, 2019, 47
  • [44] Performance enhancement for masonry creep predicting model using recurrent neural networks
    El-Shafie, Ahmed
    Noureldin, Aboelmagd
    Taha, M. Reda
    Hussain, Aini
    Basri, Hassan
    Engineering Intelligent Systems, 2009, 17 (01) : 29 - 38
  • [45] Predicting aggregate morphology of sequence-defined macromolecules with recurrent neural networks
    Bhattacharya, Debjyoti
    Kleeblatt, Devon C.
    Statt, Antonia
    Reinhart, Wesley F.
    SOFT MATTER, 2022, 18 (27) : 5037 - 5051
  • [46] Predicting the empirical distribution of video quality scores using recurrent neural networks
    Otroshi Shahreza H.
    Amini A.
    Behroozi H.
    International Journal of Engineering, Transactions B: Applications, 2020, 33 (05) : 984 - 991
  • [47] Predicting future personal life events on twitter via recurrent neural networks
    Khodabakhsh, Maryam
    Kahani, Mohsen
    Bagheri, Ebrahim
    JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2020, 54 (01) : 101 - 127
  • [48] Success and challenges in predicting TBM penetration rate using recurrent neural networks
    Shan, Feng
    He, Xuzhen
    Armaghani, Danial Jahed
    Zhang, Pin
    Sheng, Daichao
    TUNNELLING AND UNDERGROUND SPACE TECHNOLOGY, 2022, 130
  • [49] Predicting Alzheimer's disease progression using deep recurrent neural networks
    Minh Nguyen
    He, Tong
    An, Lijun
    Alexander, Daniel C.
    Feng, Jiashi
    Yeo, B. T. Thomas
    NEUROIMAGE, 2020, 222
  • [50] Predicting the number of customer transactions using stacked LSTM recurrent neural networks
    M. V. Sebt
    S. H. Ghasemi
    S. S. Mehrkian
    Social Network Analysis and Mining, 2021, 11