TCNN: TEMPORAL CONVOLUTIONAL NEURAL NETWORK FOR REAL-TIME SPEECH ENHANCEMENT IN THE TIME DOMAIN

Cited by: 0
Authors
Pandey, Ashutosh [1]
Wang, DeLiang [1, 2]
Affiliations
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
[2] Ohio State Univ, Ctr Cognit & Brain Sci, Columbus, OH 43210 USA
Keywords
noise-independent and speaker-independent speech enhancement; real-time implementation; time domain; temporal convolutional neural network; TCNN; NOISE;
DOI
10.1109/icassp.2019.8683634
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This work proposes a fully convolutional neural network (CNN) for real-time speech enhancement in the time domain. The proposed CNN is an encoder-decoder architecture with an additional temporal convolutional module (TCM) inserted between the encoder and the decoder. We call this architecture a Temporal Convolutional Neural Network (TCNN). The encoder in the TCNN creates a low-dimensional representation of a noisy input frame. The TCM uses causal and dilated convolutional layers to utilize the encoder output of the current and previous frames. The decoder uses the TCM output to reconstruct the enhanced frame. The proposed model is trained in a speaker- and noise-independent way. Experimental results demonstrate that the proposed model gives consistently better enhancement results than a state-of-the-art real-time convolutional recurrent model. Moreover, since the model is fully convolutional, it has far fewer trainable parameters than earlier models.
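The abstract states that the TCM relies on causal, dilated convolutions so that each output frame depends only on the current and previous frames. The following is a minimal pure-Python sketch of that idea, not the authors' implementation. The function names, the kernel size of 3, and the dilation schedule 1, 2, 4, 8 are illustrative assumptions that do not come from the abstract.

```python
def causal_dilated_conv1d(x, weights, dilation):
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation].

    Samples before t = 0 are treated as zeros, so y[t] never
    depends on x[t + 1], x[t + 2], ... (causality).
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # implicit left zero-padding
                acc += w * x[idx]
        y.append(acc)
    return y


def run_stack(x, weights, dilations):
    """Apply one causal dilated layer per entry in `dilations`.

    Assumption: every layer shares the same weights, purely to keep
    the sketch short. A trained model would learn separate weights.
    """
    for d in dilations:
        x = causal_dilated_conv1d(x, weights, d)
    return x


def receptive_field(kernel_size, dilations):
    """Count of current-plus-past frames visible to the stacked layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Hypothetical settings chosen only for illustration.
weights = [1.0, 1.0, 1.0]
dilations = [1, 2, 4, 8]

impulse = [1.0] + [0.0] * 7
single_layer = causal_dilated_conv1d(impulse, weights, dilation=2)

frames = [float(i) for i in range(10)]
changed = list(frames)
changed[7] = 100.0  # perturb a "future" frame

out_a = run_stack(frames, weights, dilations)
out_b = run_stack(changed, weights, dilations)
```

With these assumed settings, doubling the dilation at each layer lets the receptive field grow quickly with depth while the number of weights per layer stays fixed. Perturbing frame 7 leaves outputs 0 through 6 unchanged, which is the property that makes frame-by-frame real-time processing possible.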
Pages: 6875-6879
Number of pages: 5