Real time speech enhancement using densely connected neural networks and Squeezed temporal convolutional modules

Cited by: 1
Authors
Vanambathina, Sunny Dayal [1]
Burra, Manaswini [2]
Edupalli, Bhumika [1]
Vallem, Eswar Reddy [3]
Nellore, Venkata Sravani [3]
Affiliations
[1] Vellore Inst Technol Andhra Pradesh VIT AP, Dept Elect & Commun Engn, Amaravathi 522237, India
[2] Potti Sriramulu Chalavadhi Mallikarjuna Rao Coll E, CSE DS Dept, Vijayawada, Andhra Pradesh, India
[3] Vellore Inst Technol Andhra Pradesh VIT AP, Dept Comp Science Engn, Amaravathi 522237, India
Keywords
Speech Enhancement; Convolutional neural networks; Squeezed temporal convolutional network; PESQ; STOI; NOISE; SEPARATION; END;
DOI
10.1007/s11042-023-17492-2
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we present a fully convolutional neural network for real-time speech enhancement in the time domain. The proposed network is an encoder-decoder architecture with skip connections. The layers in the encoder and decoder consist of densely connected blocks (DCBs) with causal and dilated convolutions. The dilated convolutions help aggregate contextual information across multiple resolutions. Because the causal convolutions prevent information from future frames from flowing into the current output, the network is suitable for real-time applications. Additionally, we propose upsampling in the decoder with sub-pixel convolutional layers. We also propose placing a squeezed temporal convolutional network (STCN) after every dense block in the encoder and decoder. According to the experimental results, the proposed model substantially outperforms previous state-of-the-art models in objective quality and intelligibility scores in real-time scenarios.
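The two building blocks named in the abstract, causal dilated convolution and sub-pixel upsampling, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no kernel sizes, dilation rates, or channel counts, so the function names `causal_dilated_conv1d` and `subpixel_upsample1d`, the single-channel setting, and the channel ordering used in the shuffle are all assumptions made for illustration.

```python
import numpy as np


def causal_dilated_conv1d(x, w, dilation=1):
    """Single-channel causal dilated convolution (illustrative only).

    Computes y[t] = sum_k w[k] * x[t - k * dilation], treating samples
    before t = 0 as zero, so y[t] never depends on x[t + 1], x[t + 2], ...
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    pad = (len(w) - 1) * dilation           # left padding only => causal
    xp = np.concatenate([np.zeros(pad), x])
    y = np.zeros(len(x))
    for k, wk in enumerate(w):
        start = pad - k * dilation          # xp[start + t] == x[t - k * dilation]
        y += wk * xp[start:start + len(x)]
    return y


def subpixel_upsample1d(x, r):
    """1-D sub-pixel (shuffle) upsampling: (C * r, T) -> (C, T * r).

    Assumed ordering: out[c, t * r + i] = x[c * r + i, t]. In a sub-pixel
    layer, a convolution would first produce the C * r channels.
    """
    x = np.asarray(x)
    cr, t = x.shape
    if cr % r != 0:
        raise ValueError("channel count must be divisible by r")
    c = cr // r
    return x.reshape(c, r, t).transpose(0, 2, 1).reshape(c, t * r)


if __name__ == "__main__":
    signal = np.arange(8.0)
    print(causal_dilated_conv1d(signal, [1.0, 2.0], dilation=2))
    print(subpixel_upsample1d(np.array([[1, 2, 3], [10, 20, 30]]), r=2))
```

Padding only on the left is what makes the convolution causal, and the dilation widens the receptive field without adding weights. The shuffle trades channels for time resolution, which is how a sub-pixel layer upsamples without a transposed convolution.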
Pages: 50289-50305
Number of pages: 17
Related papers
50 in total
  • [1] Real time speech enhancement using densely connected neural networks and Squeezed temporal convolutional modules
    Sunny Dayal Vanambathina
    Manaswini Burra
    Bhumika Edupalli
    Eswar Reddy Vallem
    Venkata Sravani Nellore
    [J]. Multimedia Tools and Applications, 2024, 83 : 50289 - 50305
  • [2] DCT based densely connected convolutional GRU for real-time speech enhancement
    Jannu, Chaitanya
    Vanambathina, Sunny Dayal
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 45 (01) : 1195 - 1208
  • [3] Densely connected convolutional networks for speech recognition
    Li, Chia Yu
    Vu, Ngoc Thang
    [J]. Speech Communication - 13th ITG-Fachtagung Sprachkommunikation, 2020, : 321 - 325
  • [4] DENSELY CONNECTED NEURAL NETWORK WITH DILATED CONVOLUTIONS FOR REAL-TIME SPEECH ENHANCEMENT IN THE TIME DOMAIN
    Pandey, Ashutosh
    Wang, DeLiang
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6629 - 6633
  • [5] Efficient densely connected convolutional neural networks
    Li, Guoqing
    Zhang, Meng
    Li, Jiaojie
    Lv, Feng
    Tong, Guodong
    [J]. PATTERN RECOGNITION, 2021, 109
  • [6] TCNN: TEMPORAL CONVOLUTIONAL NEURAL NETWORK FOR REAL-TIME SPEECH ENHANCEMENT IN THE TIME DOMAIN
    Pandey, Ashutosh
    Wang, DeLiang
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6875 - 6879
  • [7] Single Channel Speech Enhancement Using Temporal Convolutional Recurrent Neural Networks
    Li, Jingdong
    Zhang, Hui
    Zhang, Xueliang
    Li, Changliang
    [J]. 2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 896 - 900
  • [8] Speech synthesis from ECoG using densely connected 3D convolutional neural networks
    Angrick, Miguel
    Herff, Christian
    Mugler, Emily
    Tate, Matthew C.
    Slutzky, Marc W.
    Krusienski, Dean J.
    Schultz, Tanja
    [J]. JOURNAL OF NEURAL ENGINEERING, 2019, 16 (03)
  • [9] Enhancement of Perivascular Spaces Using Densely Connected Deep Convolutional Neural Network
    Jung, Euijin
    Chikontwe, Philip
    Zong, Xiaopeng
    Lin, Weili
    Shen, Dinggang
    Park, Sang Hyun
    [J]. IEEE ACCESS, 2019, 7 : 18382 - 18391
  • [10] Efficient Gated Convolutional Recurrent Neural Networks for Real-Time Speech Enhancement
    Fazal-E-Wahab
    Ye, Zhongfu
    Saleem, Nasir
    Ali, Hamza
    Ali, Imad
    [J]. INTERNATIONAL JOURNAL OF INTERACTIVE MULTIMEDIA AND ARTIFICIAL INTELLIGENCE, 2023,