Reconstruction of Normal Speech from Whispered Speech based on RBF Neural Network

Cited by: 5

Authors:

Tao, Zhi [1,2]

Tan, Xue-Dan [1]

Han, Tao [1]

Gu, Ji-Hua [1]

Xu, Yi-Shen [1]

Zhao, He-Ming [2]

Affiliations:

[1] Soochow Univ, Dept Phys Sci & Tech, Suzhou, Peoples R China

[2] Soochow Univ, Dept Electron, Suzhou, Peoples R China

Source:

2010 THIRD INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY AND SECURITY INFORMATICS (IITSI 2010) | 2010

Keywords:

whispered speech; voice conversion; radial basis function neural network

DOI:

10.1109/IITSI.2010.118

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

This paper proposes a method for reconstructing normal speech from Chinese whispered speech based on a radial basis function neural network (RBF NN). First, the RBF NN captures the nonlinear mapping between the spectral envelopes of whispered and normal speech; second, the trained network modifies the spectral envelope of the whispered speech; finally, a line spectral pair (LSP) synthesizer converts the whispered speech into normal speech. The quality of the converted speech is assessed both subjectively and objectively. Simulation results show a Mean Opinion Score (MOS) of 3.2 and a reduced Bark spectral distortion distance. Both the intelligibility and the quality of the converted speech are satisfactory.
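The mapping step in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the network size, the center-selection rule, the training algorithm, or the feature dimension, so everything below is an assumption made for illustration. The sketch uses Gaussian basis functions, centers picked at random from the training frames, output weights fitted by linear least squares, and synthetic vectors standing in for whispered and normal spectral-envelope features. The names `rbf_features`, `fit_rbf`, and `apply_rbf` are hypothetical.

```python
import numpy as np


def rbf_features(X, centers, width):
    """Gaussian RBF activations plus a constant bias column."""
    # Squared Euclidean distance between every frame and every center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    phi = np.exp(-d2 / (2.0 * width ** 2))
    return np.hstack([phi, np.ones((X.shape[0], 1))])


def fit_rbf(X, Y, centers, width):
    """Fit the linear output weights by least squares (an assumed training rule)."""
    W, *_ = np.linalg.lstsq(rbf_features(X, centers, width), Y, rcond=None)
    return W


def apply_rbf(X, centers, width, W):
    """Map input frames through the trained RBF network."""
    return rbf_features(X, centers, width) @ W


# Synthetic stand-in data: NOT real speech features.
rng = np.random.default_rng(0)
n_frames, dim = 200, 4                      # arbitrary sizes for the demo
whisper = rng.uniform(0.0, 1.0, size=(n_frames, dim))
normal = np.sin(0.5 * np.pi * whisper)      # made-up smooth nonlinear mapping

# Assumed choices: 30 centers drawn from the training frames, shared width 0.7.
centers = whisper[rng.choice(n_frames, size=30, replace=False)]
width = 0.7

W = fit_rbf(whisper, normal, centers, width)
converted = apply_rbf(whisper, centers, width, W)

train_mse = float(np.mean((converted - normal) ** 2))
baseline_mse = float(np.mean((whisper - normal) ** 2))
print(f"MSE without mapping: {baseline_mse:.5f}")
print(f"MSE after RBF mapping: {train_mse:.5f}")
```

In a real system the rows of `whisper` and `normal` would be time-aligned spectral-envelope vectors from parallel recordings, and the mapped envelope would then drive a synthesizer; neither the alignment nor the synthesis stage is shown here.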

Pages: 374-377

Page count: 4
