An optimized convolutional neural network for speech enhancement

Cited by: 0

Authors:

Karthik A. [1, 2]

Mazher Iqbal J.L. [1]

Affiliations:

[1] Department of ECE, Veltech Rangarajan Dr Sagunthala R&D Institute of Science and Technology, Chennai

[2] Department of ECE, Institute of Aeronautical Engineering, Hyderabad

Source:

International Journal of Speech Technology | 2023 / Vol. 26 / Issue 04

Keywords:

Character error rate; Convolutional neural network; Minimization; Optimization; Recognition; Speech enhancement;

DOI:

10.1007/s10772-023-10073-6

Abstract:

Speech enhancement matters because many applications rely on voice recognition to carry out operations. Commands are recognized reliably only when the voice is recognized correctly, so the speech signal must be enhanced and freed from background noise before recognition. The existing approach uses a recurrent convolutional encoder/decoder to denoise the speech signal and relies on the signal-to-noise ratio (SNR) property to enhance it. It removes noise effectively and achieves a low character error rate, but it does not specify the SNR range of the noise added to the signal. This work therefore proposes an optimized deep learning method for speech enhancement. Deep learning is an AI technique that mimics the human brain's ability to analyze data and form patterns for decision making. An optimized convolutional neural network (CNN) is proposed to enhance speech corrupted by noise at different SNR values. Particle swarm optimization tunes the hyperparameters of the CNN, with the aim of minimizing the character error rate of the signal. The proposed method is implemented in MATLAB R2020b and evaluated using the character error rate, PESQ, and STOI of the signal. The proposed and existing methods are then compared on these metrics at SNRs of −5 dB, 0 dB, +5 dB, and +10 dB. © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
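The abstract does not give the search space, the swarm settings, or the way the character error rate is computed, so the following is only a minimal Python sketch of the general idea (the paper itself reports a MATLAB R2020b implementation). It assumes the character error rate is the Levenshtein edit distance divided by the reference length, and it uses a hypothetical closed-form objective, `surrogate_cer`, in place of actually training a CNN and scoring its output. The function names, bounds, and PSO coefficients are illustrative assumptions, not values from the paper.

```python
import random


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Levenshtein edit distance divided by the reference length (assumed CER definition)."""
    if not reference:
        raise ValueError("reference must be non-empty")
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(reference)


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c_cognitive=1.5, c_social=1.5, seed=0):
    """Basic global-best particle swarm optimization over box-constrained parameters."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_cognitive * r1 * (pbest[i][d] - pos[i][d])
                             + c_social * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def surrogate_cer(params):
    """Hypothetical stand-in for 'train the CNN with these hyperparameters, return its CER'."""
    log10_learning_rate, n_filters = params
    return (log10_learning_rate + 3.0) ** 2 + ((n_filters - 64.0) / 64.0) ** 2


# Assumed search box: log10(learning rate) in [-5, -1], filter count in [8, 128].
best_params, best_value = pso_minimize(surrogate_cer, bounds=[(-5.0, -1.0), (8.0, 128.0)])
print(best_params, best_value)
print(character_error_rate("kitten", "sitting"))
```

In a real pipeline the objective would train and evaluate the network on noisy speech at each SNR, which makes every evaluation expensive; that cost is why small swarms and modest iteration counts are a common choice for this kind of hyperparameter search.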

Pages: 1117-1129

Page count: 12
