Convolutional gated recurrent unit networks based real-time monaural speech enhancement

Cited by: 0

Authors:

Sunny Dayal Vanambathina

Vaishnavi Anumola

Ponnapalli Tejasree

R. Divya

B. Manaswini

Affiliations:

[1] Department of Electronics and Communications Engineering

[2] Vellore Institute of Technology

[3] Andhra Pradesh (VIT-AP)

[4] Computer Science and Engineering Department

[5] Lakireddy Balireddy College of Engineering

Source:

Multimedia Tools and Applications | 2023 / Vol. 82

Keywords:

Speech enhancement; Deep learning; Discrete cosine transform; Signal-to-noise ratio

DOI:

Not available

Abstract:

Deep-learning-based speech enhancement has many applications, such as improving speech intelligibility and perceptual quality. Many methods focus on enhancing the amplitude spectrum. In existing models, the complex-valued layers are computationally expensive, which poses a major challenge for the device. Discrete Fourier transform (DFT) data is complex-valued, so computation is difficult because the real and imaginary parts of the signal must be handled at the same time. To reduce the computation, some researchers use variants of the short-time Fourier transform (STFT) as input, such as the amplitude or energy spectrum and the log-Mel spectrum. These approaches all enhance the amplitude spectrum without estimating the clean phase, which limits enhancement performance. The proposed method uses the discrete cosine transform (DCT), a real-valued transform that loses no information and contains the phase implicitly. This avoids having to manually design a complex network to estimate the explicit phase, and it improves enhancement performance. Much research has been done on estimating the phase spectrum, both directly and indirectly, but the results are not ideal. Recently, complex-valued models such as the deep complex convolution recurrent network (DCCRN) have been proposed, but their computational cost is very high. A Deep Cosine transform convolutional Gated recurrent Unit (DCTCGRU) is therefore proposed to reduce complexity and further improve performance. The gated recurrent unit (GRU) models the correlation between adjacent frames of noisy speech well. Experimental results show that DCTCGRU outperforms state-of-the-art algorithms in terms of SNR, PESQ, and STOI.
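The abstract's contrast between complex-valued DFT data and a real-valued, lossless DCT can be illustrated with a small NumPy sketch. It is not the authors' implementation: the 512-sample frame length, the random test frame, and the helper name `dct_matrix` are assumptions made for illustration, and the orthonormal DCT-II shown here may differ from the DCT variant and framing used in the paper.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II matrix C (hypothetical helper), with C @ C.T = I."""
    k = np.arange(n)[:, None]   # frequency index, one per row
    t = np.arange(n)[None, :]   # time index, one per column
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    c[0, :] /= np.sqrt(2.0)     # rescale the DC row so that all rows have unit norm
    return c


rng = np.random.default_rng(0)
frame = rng.standard_normal(512)   # assumed frame length; stands in for one speech frame

# DCT path: coefficients are real, and the transpose of C inverts the transform.
C = dct_matrix(frame.size)
coeffs = C @ frame
restored = C.T @ coeffs

# DFT path: coefficients are complex, so a network must handle two parts per bin.
spectrum = np.fft.rfft(frame)

# Keeping only the DFT magnitude (discarding phase) does not recover the frame.
magnitude_only = np.fft.irfft(np.abs(spectrum), n=frame.size)

print("DCT coefficients are real:", not np.iscomplexobj(coeffs))
print("DCT round trip is lossless:", np.allclose(restored, frame))
print("DFT coefficients are complex:", np.iscomplexobj(spectrum))
print("Magnitude-only DFT recovers frame:", np.allclose(magnitude_only, frame))
```

Because the sign and distribution of the real DCT coefficients carry the information that a DFT splits into magnitude and phase, a network operating on them works with the phase implicitly, which is the property the abstract relies on.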

Citation

Pages: 45717-45732

Number of pages: 15

50 entries in total

[1] Convolutional gated recurrent unit networks based real-time monaural speech enhancement
Vanambathina, Sunny Dayal
Anumola, Vaishnavi
Tejasree, Ponnapalli
Divya, R.
Manaswini, B.
[J]. Multimedia Tools and Applications, 2023, 82 (29): 45717-45732
[2] Efficient Gated Convolutional Recurrent Neural Networks for Real-Time Speech Enhancement
Fazal-E-Wahab
Ye, Zhongfu
Saleem, Nasir
Ali, Hamza
Ali, Imad
[J]. International Journal of Interactive Multimedia and Artificial Intelligence, 2023
[3] Learning Complex Spectral Mapping With Gated Convolutional Recurrent Networks for Monaural Speech Enhancement
Tan, Ke
Wang, DeLiang
[J]. IEEE-ACM Transactions on Audio Speech and Language Processing, 2020, 28: 380-390
[4] Real-Time Speech Enhancement Based on Convolutional Recurrent Neural Network
Girirajan, S.
Pandian, A.
[J]. Intelligent Automation and Soft Computing, 2023, 35 (02): 1987-2001
[5] A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement
Tan, Ke
Wang, DeLiang
[J]. 19th Annual Conference of the International Speech Communication Association (INTERSPEECH 2018), Vols 1-6: Speech Research for Emerging Markets in Multilingual Societies, 2018: 3229-3233
[6] Convolutional quasi-recurrent network for real-time speech enhancement
Shi, Yunlong
Yuan, Wenhao
Hu, Shaodong
Lou, Yingxi
[J]. Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University, 2022, 49 (03): 183-190
[7] A Convolutional Gated Recurrent Network for Speech Enhancement
Yuan, Wen-Hao
Hu, Shao-Dong
Shi, Yun-Long
Li, Zhao
Liang, Chun-Yan
[J]. Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2020, 48 (07): 1276-1283
[8] Dilated convolutional recurrent neural network for monaural speech enhancement
Pirhosseinloo, Shadi
Brumberg, Jonathan S.
[J]. Conference Record of the 2019 Fifty-Third Asilomar Conference on Signals, Systems & Computers, 2019: 158-162
[9] A Subconvolutional U-net with Gated Recurrent Unit and Efficient Channel Attention Mechanism for Real-Time Speech Enhancement
Yechuri, Sivaramakrishna
Vanambathina, Sunnydayal
[J]. Wireless Personal Communications, 2024
