LOMBARD SPEECH SYNTHESIS USING LONG SHORT-TERM MEMORY RECURRENT NEURAL NETWORKS

Cited by: 0
Authors
Bollepalli, Bajibabu [1]
Airaksinen, Manu [1]
Alku, Paavo [1]
Affiliations
[1] Aalto Univ, Dept Signal Proc & Acoust, Espoo, Finland
Funding
Academy of Finland;
Keywords
Lombard speech synthesis; adaptation; LSTM-TTS; NOISE; INTELLIGIBILITY; ADAPTATION;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
In statistical parametric speech synthesis (SPSS), a few studies have investigated the Lombard effect, specifically using hidden Markov model (HMM)-based systems. Recently, artificial neural networks, in particular long short-term memory recurrent neural networks (LSTMs), have demonstrated promising results in SPSS. The Lombard effect, however, has not been studied in LSTM-based speech synthesis systems. In this study, we propose three methods for Lombard speech adaptation in LSTM-based speech synthesis: (1) augmenting the linguistic input features with Lombard-specific information, (2) scaling the hidden activations using the learning hidden unit contributions (LHUC) method, and (3) fine-tuning LSTMs trained on normal speech with a small amount of Lombard speech data. To investigate the effectiveness of the proposed methods, we carry out experiments using small (10 utterances) and large (500 utterances) Lombard speech datasets. Experimental results confirm the adaptability of the LSTMs, and similarity tests show that the LSTMs can achieve significantly better adaptation performance than the HMMs in both the small and large data conditions.
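As a rough illustration of methods (1) and (2), the sketch below appends a style code to each frame of linguistic features and rescales hidden activations with per-unit LHUC amplitudes. The abstract does not say how the Lombard-specific information is encoded or how the LHUC scaling is parameterized, so the one-dimensional style flag and the 2*sigmoid(a) amplitude (a commonly used LHUC form) are assumptions, and all function names are hypothetical. Fine-tuning, method (3), is not shown.

```python
import math


def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def augment_input(linguistic_features, style_code):
    """Method (1), assumed form: append a style code to every frame."""
    return [frame + list(style_code) for frame in linguistic_features]


def lhuc_scale(hidden, lhuc_params):
    """Method (2), assumed form: scale each hidden unit by 2*sigmoid(a).

    The amplitude lies in (0, 2); a = 0 leaves the activation unchanged.
    """
    return [h * 2.0 * sigmoid(a) for h, a in zip(hidden, lhuc_params)]


# Two frames of toy linguistic features (values are illustrative only).
frames = [[0.1, 0.0, 1.0], [0.3, 1.0, 0.0]]
lombard_code = [1.0]  # hypothetical flag: 1.0 = Lombard, 0.0 = normal
augmented = augment_input(frames, lombard_code)

# Toy hidden activations of one layer at one time step.
hidden = [0.5, -0.2, 0.8]
identity = lhuc_scale(hidden, [0.0, 0.0, 0.0])  # no adaptation
adapted = lhuc_scale(hidden, [1.5, -1.5, 0.0])  # amplify unit 0, damp unit 1
```

In such a setup only the LHUC parameters (one per hidden unit) would be learned from the Lombard data while the network weights stay fixed, which is why the approach is attractive when only a few adaptation utterances are available.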
Pages: 5505-5509
Number of pages: 5