Convolutional gated recurrent unit networks based real-time monaural speech enhancement

Authors
Sunny Dayal Vanambathina
Vaishnavi Anumola
Ponnapalli Tejasree
R. Divya
B. Manaswini
Affiliations
[1] Department of Electronics and Communications Engineering, Vellore Institute of Technology, Andhra Pradesh (VIT-AP)
[2] Computer Science and Engineering Department, Lakireddy Balireddy College of Engineering
Source
Multimedia Tools and Applications, 2023, 82 (29)
Keywords
Speech enhancement; Deep learning; Discrete cosine transform; Signal to noise ratio
DOI
Not available
Abstract
Deep-learning-based speech enhancement has many applications, such as improving speech intelligibility and perceptual quality. Many methods focus on enhancing the amplitude spectrum. In existing models, the computational cost of complex-valued layers is very high, which poses a major challenge for the target device. DFT data is complex-valued, so computation is difficult because the real and imaginary parts of the signal must be handled at the same time. To reduce computation, some researchers use variants of the STFT as input, such as the amplitude/energy spectrum or the log-Mel spectrum. These approaches all enhance the amplitude spectrum without estimating the clean phase, which limits enhancement performance. The proposed method uses the DCT, a real-valued transform that loses no information and contains the phase implicitly. This avoids having to manually design a complex network to estimate the explicit phase, and it improves enhancement performance. Considerable research has addressed estimating the phase spectrum, directly or indirectly, but the results are not ideal. Recently, complex-valued models such as the deep complex convolution recurrent network (DCCRN) have been proposed, but their computational cost is very high. A Deep Cosine Transform Convolutional Gated Recurrent Unit (DCTCGRU) network is therefore proposed to reduce complexity and further improve performance. The GRU models the correlation between adjacent frames of noisy speech well. Experimental results show that DCTCGRU achieves better SNR, PESQ and STOI scores than state-of-the-art algorithms.
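The abstract's argument for the DCT can be illustrated with a small sketch. It is not the authors' implementation: the helper names (`dct2`, `idct2`, `dft`), the 16-sample frame, and the test sinusoid are all assumptions made for illustration, and the transforms are written naively in pure Python rather than with an optimized library. The sketch shows three things: the DCT of a real frame is real-valued, it can be inverted without loss, and a phase change that leaves the DFT magnitude untouched still changes the DCT coefficients, which is one way to read "contains implicit phase".

```python
import cmath
import math


def dct2(frame):
    """Orthonormal DCT-II of a real frame; every output value is a real float."""
    n = len(frame)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i, x in enumerate(frame))
        out.append(scale * acc)
    return out


def idct2(coeffs):
    """Inverse of dct2 (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        acc = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            acc += scale * c * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        out.append(acc)
    return out


def dft(frame):
    """Naive DFT; the output is complex-valued (real and imaginary parts)."""
    n = len(frame)
    return [sum(x * cmath.exp(-2j * math.pi * i * k / n)
                for i, x in enumerate(frame))
            for k in range(n)]


def sinusoid(phase, n=16, cycles=3):
    """Hypothetical test frame: a sinusoid with a chosen phase offset."""
    return [math.sin(2 * math.pi * cycles * i / n + phase) for i in range(n)]


frame_a = sinusoid(0.7)
frame_b = sinusoid(1.9)  # same frequency and amplitude, different phase

# 1) Real-valued and lossless: the frame is recovered from its DCT.
coeffs_a = dct2(frame_a)
restored_a = idct2(coeffs_a)
max_error = max(abs(x - y) for x, y in zip(frame_a, restored_a))

# 2) The DFT needs complex values to represent the same frame.
spectrum_a = dft(frame_a)
has_imaginary_part = any(abs(z.imag) > 1e-6 for z in spectrum_a)

# 3) A magnitude-only view cannot tell the two frames apart, the DCT can.
mag_a = [abs(z) for z in spectrum_a]
mag_b = [abs(z) for z in dft(frame_b)]
magnitude_gap = max(abs(x - y) for x, y in zip(mag_a, mag_b))
dct_gap = max(abs(x - y) for x, y in zip(coeffs_a, dct2(frame_b)))

print("reconstruction error:", max_error)
print("DFT has imaginary part:", has_imaginary_part)
print("DFT magnitude gap:", magnitude_gap)
print("DCT coefficient gap:", dct_gap)
```

In a real system the transform would be applied to overlapping windowed frames and computed with a fast algorithm; the naive loops here are only meant to keep the example dependency-free.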
Pages: 45717-45732
Number of pages: 15
Related papers
50 in total
  • [1] Convolutional gated recurrent unit networks based real-time monaural speech enhancement
    Vanambathina, Sunny Dayal
    Anumola, Vaishnavi
    Tejasree, Ponnapalli
    Divya, R.
    Manaswini, B.
    [J]. Multimedia Tools and Applications, 2023, 82 (29): 45717-45732
  • [3] Efficient Gated Convolutional Recurrent Neural Networks for Real-Time Speech Enhancement
    Fazal-E-Wahab
    Ye, Zhongfu
    Saleem, Nasir
    Ali, Hamza
    Ali, Imad
    [J]. INTERNATIONAL JOURNAL OF INTERACTIVE MULTIMEDIA AND ARTIFICIAL INTELLIGENCE, 2023
  • [4] Learning Complex Spectral Mapping With Gated Convolutional Recurrent Networks for Monaural Speech Enhancement
    Tan, Ke
    Wang, DeLiang
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28: 380-390
  • [5] Real-Time Speech Enhancement Based on Convolutional Recurrent Neural Network
    Girirajan, S.
    Pandian, A.
    [J]. INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2023, 35 (02): 1987-2001
  • [6] A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement
    Tan, Ke
    Wang, DeLiang
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018: 3229-3233
  • [7] Convolutional quasi-recurrent network for real-time speech enhancement
    Shi, Yunlong
    Yuan, Wenhao
    Hu, Shaodong
    Lou, Yingxi
    [J]. Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University, 2022, 49 (03): 183-190
  • [8] A Convolutional Gated Recurrent Network for Speech Enhancement
    Yuan, Wen-Hao
    Hu, Shao-Dong
    Shi, Yun-Long
    Li, Zhao
    Liang, Chun-Yan
    [J]. Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2020, 48 (07): 1276-1283
  • [9] Dilated convolutional recurrent neural network for monaural speech enhancement
    Pirhosseinloo, Shadi
    Brumberg, Jonathan S.
    [J]. CONFERENCE RECORD OF THE 2019 FIFTY-THIRD ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, 2019: 158-162
  • [10] A Subconvolutional U-net with Gated Recurrent Unit and Efficient Channel Attention Mechanism for Real-Time Speech Enhancement
    Yechuri, Sivaramakrishna
    Vanambathina, Sunnydayal
    [J]. WIRELESS PERSONAL COMMUNICATIONS, 2024