Single-channel speech enhancement by subspace affinity minimization

Cited by: 3
Authors
Tran, Dung N. [1 ]
Koishida, Kazuhito [1 ]
Affiliations
[1] Microsoft Corp, Redmond, WA 98052 USA
Source
INTERSPEECH 2020
Keywords
speech enhancement; noise reduction; deep neural network; convolutional neural network; regression; subspace affinity;
DOI
10.21437/Interspeech.2020-2982
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline codes
100104; 100213
Abstract
In data-driven speech enhancement frameworks, learning informative representations is crucial for obtaining a high-quality estimate of the target speech. State-of-the-art speech enhancement methods based on deep neural networks (DNNs) commonly learn a single embedding from the noisy input to predict clean speech. This compressed representation inevitably mixes noise and speech information, leading to speech distortion and poor noise reduction. To alleviate this issue, we propose to learn separate speech and noise embeddings from the noisy input and introduce a subspace affinity loss function that prevents information from leaking between the two representations. We rigorously prove that minimizing this loss function yields maximally uncorrelated speech and noise representations, which blocks information leakage. We show empirically that the proposed framework outperforms traditional and state-of-the-art speech enhancement methods in various unseen nonstationary noise environments. Our results suggest that learning uncorrelated speech and noise embeddings can improve noise reduction and reduce speech distortion in speech enhancement applications.
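The subspace affinity idea in the abstract can be sketched as follows. This is an illustrative reconstruction based only on the abstract, not the paper's exact loss: one plausible affinity between speech and noise embedding matrices takes orthonormal bases of the two embedding subspaces via QR and returns the normalized Frobenius norm of their cross-correlation, which is 0 when the subspaces are orthogonal (maximally uncorrelated) and 1 when they coincide. The function name and normalization are assumptions.

```python
import numpy as np

def subspace_affinity(E_s, E_n, eps=1e-8):
    """Hypothetical sketch of a subspace affinity measure.

    E_s, E_n: speech and noise embedding matrices, shape [batch, dim].
    Returns a scalar in [0, 1]: 0 for orthogonal subspaces, 1 for
    identical subspaces. Minimizing this value during training would
    push the two embeddings toward being maximally uncorrelated.
    """
    # Orthonormal bases for the column spaces of the (transposed) embeddings.
    Q_s, _ = np.linalg.qr(E_s.T)
    Q_n, _ = np.linalg.qr(E_n.T)
    # Frobenius norm of the cross-correlation between the two bases,
    # normalized by the smaller subspace dimension.
    d = min(Q_s.shape[1], Q_n.shape[1])
    return np.linalg.norm(Q_s.T @ Q_n) / (np.sqrt(d) + eps)

# Example: embeddings with disjoint coordinate support are orthogonal,
# so their affinity is ~0; an embedding compared with itself gives ~1.
E_speech = np.array([[1., 0., 0., 0.], [0., 1., 0., 0.]])
E_noise = np.array([[0., 0., 1., 0.], [0., 0., 0., 1.]])
print(round(float(subspace_affinity(E_speech, E_noise)), 6))
print(round(float(subspace_affinity(E_speech, E_speech)), 6))
```

In a training loop, this scalar would be added as a regularizer to the main enhancement loss, penalizing overlap between the speech and noise representations.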
Pages: 2447 - 2451
Page count: 5
Related papers
50 entries
  • [21] Single-channel speech enhancement based on frequency domain ALE
    Nakanishi, Isao
    Nagata, Yuudai
    Itoh, Yoshio
    Fukui, Yutaka
    2006 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-11, PROCEEDINGS, 2006, : 2541 - 2544
  • [22] Single-channel speech enhancement using learnable loss mixup
    Chang, Oscar
    Tran, Dung N.
    Koishida, Kazuhito
    INTERSPEECH 2021, 2021, : 2696 - 2700
  • [23] A two-stage method for single-channel speech enhancement
    Hamid, ME
    Fukabayashi, T
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2006, E89A (04) : 1058 - 1068
  • [24] ON PHASE IMPORTANCE IN PARAMETER ESTIMATION IN SINGLE-CHANNEL SPEECH ENHANCEMENT
    Mowlaee, Pejman
    Saeidi, Rahim
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7462 - 7466
  • [25] Modified Amplitude Spectral Estimator for Single-Channel Speech Enhancement
    Zhai, Zhenhui
    Ou, Shifeng
    Gao, Ying
    PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON ADVANCES IN MECHANICAL ENGINEERING AND INDUSTRIAL INFORMATICS (AMEII 2016), 2016, 73 : 1115 - 1120
  • [26] Single-Channel Online Enhancement of Speech Corrupted by Reverberation and Noise
    Doire, Clement S. J.
    Brookes, Mike
    Naylor, Patrick A.
    Hicks, Christopher M.
    Betts, Dave
    Dmour, Mohammad A.
    Jensen, Soren Holdt
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (03) : 572 - 587
  • [27] SINGLE-CHANNEL SPEECH ENHANCEMENT WITH SEQUENTIALLY TRAINED DNN SYSTEM
    Sun, Yang
    Xian, Yang
    Wang, Wenwu
    Naqvi, Syed Mohsen
    2019 13TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATION SYSTEMS (ICSPCS), 2019,
  • [28] Deep Neural Network for Supervised Single-Channel Speech Enhancement
    Saleem, Nasir
    Irfan Khattak, Muhammad
    Ali, Muhammad Yousaf
    Shafi, Muhammad
    ARCHIVES OF ACOUSTICS, 2019, 44 (01) : 3 - 12
  • [29] INVESTIGATION OF A PARAMETRIC GAIN APPROACH TO SINGLE-CHANNEL SPEECH ENHANCEMENT
    Huang, Gongping
    Chen, Jingdong
    Benesty, Jacob
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 206 - 210
  • [30] SPEAKER AND NOISE INDEPENDENT ONLINE SINGLE-CHANNEL SPEECH ENHANCEMENT
    Germain, Francois G.
    Mysore, Gautham J.
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 71 - 75