Reconstruction of Normal Speech from Whispered Speech based on RBF Neural Network

Cited by: 5
Authors
Tao, Zhi [1, 2]
Tan, Xue-Dan [1 ]
Han, Tao [1 ]
Gu, Ji-Hua [1 ]
Xu, Yi-Shen [1 ]
Zhao, He-Ming [2 ]
Affiliations
[1] Soochow Univ, Dept Phys Sci & Tech, Suzhou, Peoples R China
[2] Soochow Univ, Dept Electron, Suzhou, Peoples R China
Keywords
whispered speech; voice conversion; radial basis function neural network
DOI
10.1109/IITSI.2010.118
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes a method for reconstructing normal speech from Chinese whispered speech based on a radial basis function neural network (RBF NN). First, the RBF NN captures the nonlinear mapping between the spectral envelopes of whispered and normal speech. Second, the trained network is used to modify the spectral envelope of the whispered speech. Finally, a line spectral pair (LSP) synthesizer converts the whispered speech into normal speech. The quality of the converted speech is assessed both subjectively and objectively. Simulation results show a Mean Opinion Score (MOS) of 3.2 and a reduced Bark spectral distortion. Both the intelligibility and the quality of the converted speech are satisfactory.
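The mapping step described in the abstract can be illustrated with a minimal sketch. The record does not give the network's configuration, so everything below is an assumption made for illustration: the synthetic feature vectors standing in for whispered and normal spectral-envelope parameters, the function names `rbf_design`, `train_rbf`, and `convert`, the random choice of centers, the fixed Gaussian width, and the regularized least-squares fit of the output weights. It is not the authors' implementation.

```python
import numpy as np

def rbf_design(X, centers, width):
    """Gaussian RBF activations (one column per center) plus a bias column."""
    # Squared Euclidean distance between every input row and every center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    phi = np.exp(-d2 / (2.0 * width ** 2))
    return np.hstack([phi, np.ones((X.shape[0], 1))])

def train_rbf(X, Y, n_centers=40, width=1.0, ridge=1e-6, seed=0):
    """Fit output weights by regularized least squares (assumed training rule)."""
    rng = np.random.default_rng(seed)
    # Assumption: centers are a random subset of the training inputs.
    idx = rng.choice(X.shape[0], size=n_centers, replace=False)
    centers = X[idx]
    Phi = rbf_design(X, centers, width)
    A = Phi.T @ Phi + ridge * np.eye(Phi.shape[1])
    W = np.linalg.solve(A, Phi.T @ Y)
    return centers, W

def convert(X, centers, W, width=1.0):
    """Map whispered-speech feature vectors to estimated normal-speech features."""
    return rbf_design(X, centers, width) @ W

# Synthetic stand-in data: 200 frames of 10-dimensional envelope features.
# The nonlinear relation below is invented purely to exercise the code.
rng = np.random.default_rng(1)
whisper = rng.uniform(0.0, 1.0, size=(200, 10))
normal = np.sin(whisper) + 0.5 * whisper ** 2

centers, W = train_rbf(whisper, normal, n_centers=40, width=1.0)
pred = convert(whisper, centers, W, width=1.0)

rbf_mse = float(np.mean((pred - normal) ** 2))
baseline_mse = float(np.mean((normal.mean(axis=0) - normal) ** 2))
print("RBF training MSE:", rbf_mse)
print("Mean-predictor MSE:", baseline_mse)
```

In a real system the rows of `whisper` and `normal` would be time-aligned spectral-envelope parameters from paired recordings, and the converted parameters would be passed to a synthesizer; neither step is shown here.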
Pages: 374-377
Number of pages: 4