Joint Time-Frequency and Time Domain Learning for Speech Enhancement

Cited by: 0
Authors
Tang, Chuanxin [1]
Luo, Chong [1]
Zhao, Zhiyuan [1]
Xie, Wenxuan [1]
Zeng, Wenjun [1]
Affiliations
[1] Microsoft Research Asia, Beijing, People's Republic of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
For single-channel speech enhancement, time-domain and time-frequency-domain (T-F-domain) methods each have their own pros and cons. In this paper, we present a cross-domain framework named TFT-Net, which takes a time-frequency spectrogram as input and produces a time-domain waveform as output. Such a framework takes advantage of existing knowledge about spectrograms and avoids some of the drawbacks that T-F-domain methods have been suffering from. In TFT-Net, we design an innovative dual-path attention block (DAB) to fully exploit correlations along the time and frequency axes. We further discover that a sample-independent DAB (SDAB) achieves a good trade-off between enhanced speech quality and complexity. Ablation studies show that both the cross-domain design and the SDAB block bring large performance gains. When logarithmic MSE is used as the training criterion, TFT-Net achieves the highest SDR and SSNR among state-of-the-art methods on two major speech enhancement benchmarks.
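The abstract names logarithmic MSE as the training criterion and SDR and SSNR as the evaluation metrics, but it gives no formulas. As a rough sketch only, assuming common textbook definitions (not necessarily the exact ones the authors used), these quantities could be computed in NumPy as follows. The function names, the 256-sample frame length, and the -10 to 35 dB clipping range are illustrative assumptions, not values taken from the paper.

```python
import numpy as np


def log_mse(clean, estimate, eps=1e-8):
    """Base-10 logarithm of the mean squared error between two waveforms.

    Assumed form of a "logarithmic MSE" criterion; eps avoids log(0).
    """
    err = np.mean((clean - estimate) ** 2)
    return float(np.log10(err + eps))


def sdr_db(clean, estimate, eps=1e-8):
    """Signal-to-distortion ratio in dB, in its simple energy-ratio form."""
    signal = np.sum(clean ** 2)
    distortion = np.sum((clean - estimate) ** 2)
    return float(10.0 * np.log10((signal + eps) / (distortion + eps)))


def ssnr_db(clean, estimate, frame_len=256, lo=-10.0, hi=35.0, eps=1e-8):
    """Segmental SNR: per-frame SNR in dB, clipped to [lo, hi], then averaged.

    Trailing samples that do not fill a whole frame are ignored.
    """
    n_frames = len(clean) // frame_len
    c = clean[: n_frames * frame_len].reshape(n_frames, frame_len)
    e = estimate[: n_frames * frame_len].reshape(n_frames, frame_len)
    num = np.sum(c ** 2, axis=1)
    den = np.sum((c - e) ** 2, axis=1)
    snr = 10.0 * np.log10((num + eps) / (den + eps))
    return float(np.mean(np.clip(snr, lo, hi)))


# Toy usage: a 440 Hz tone at 16 kHz with additive Gaussian noise
# standing in for an imperfect enhanced waveform.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = np.sin(2 * np.pi * 440.0 * t)
estimate = clean + 0.05 * rng.standard_normal(clean.shape)
print(log_mse(clean, estimate), sdr_db(clean, estimate), ssnr_db(clean, estimate))
```

With these definitions, a better estimate gives a lower log-MSE and higher SDR and SSNR. Published benchmark numbers may rely on different conventions (for example scale-invariant SDR or other frame settings), so values from this sketch are not directly comparable to those reported for TFT-Net.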
Pages: 3816-3822
Number of pages: 7