Single-channel speech enhancement by subspace affinity minimization

Cited by: 3
Authors:
Tran, Dung N. [1]
Koishida, Kazuhito [1]
Affiliation:
[1] Microsoft Corp, Redmond, WA 98052 USA
Source:
INTERSPEECH 2020
Keywords:
speech enhancement; noise reduction; deep neural network; convolutional neural network; regression; subspace affinity
DOI:
10.21437/Interspeech.2020-2982
Chinese Library Classification (CLC):
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline codes:
100104; 100213
Abstract:
In data-driven speech enhancement frameworks, learning informative representations is crucial to obtaining a high-quality estimate of the target speech. State-of-the-art speech enhancement methods based on deep neural networks (DNNs) commonly learn a single embedding from the noisy input to predict clean speech. This compressed representation inevitably contains both noise and speech information, leading to speech distortion and poor noise reduction performance. To alleviate this issue, we proposed to learn separate speech and noise embeddings from the noisy input and introduced a subspace affinity loss function to prevent information leakage between the two representations. We rigorously proved that minimizing this loss function yields maximally uncorrelated speech and noise representations, which blocks information leakage. We empirically showed that our proposed framework outperforms traditional and state-of-the-art speech enhancement methods in various unseen nonstationary noise environments. Our results suggest that learning uncorrelated speech and noise embeddings can improve noise reduction and reduce speech distortion in speech enhancement applications.
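The abstract describes a subspace affinity loss that pushes the learned speech and noise embeddings toward being maximally uncorrelated. The paper's exact formulation is not reproduced in this record; as a rough illustration only, one standard way to measure the affinity between two subspaces is the squared Frobenius norm of the cross-product of their orthonormal bases, which is zero exactly when the subspaces are mutually orthogonal. The function below (`subspace_affinity`, the QR-based orthonormalization, and the example matrices are all assumptions of this sketch, not the authors' code):

```python
import numpy as np

def subspace_affinity(speech_emb, noise_emb):
    """Squared Frobenius norm of the cross-product of orthonormal bases
    for the two embedding subspaces. Zero iff the subspaces are mutually
    orthogonal (no shared directions, hence no information leakage in
    this simplified sense). NOTE: illustrative sketch, not the paper's
    exact loss.
    """
    # Orthonormal bases via QR; columns of speech_emb / noise_emb are
    # assumed to span the respective (d x k) embedding subspaces.
    qs, _ = np.linalg.qr(speech_emb)
    qn, _ = np.linalg.qr(noise_emb)
    return float(np.linalg.norm(qs.T @ qn, ord="fro") ** 2)

# Orthogonal subspaces -> zero affinity.
s = np.array([[1.0], [0.0], [0.0]])
n = np.array([[0.0], [1.0], [0.0]])
print(subspace_affinity(s, n))  # -> 0.0

# Identical subspaces -> maximal affinity (here 1.0 for 1-D subspaces).
print(subspace_affinity(s, s))  # -> 1.0
```

In a training setup this quantity would be differentiable with respect to the embeddings and could be added to a reconstruction loss as a penalty; minimizing it drives the two subspaces apart, matching the abstract's claim that the minimizer yields maximally uncorrelated representations.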
Pages: 2447-2451
Page count: 5
Related papers
50 records in total
  • [31] On Speech Intelligibility Estimation of Phase-Aware Single-Channel Speech Enhancement
    Gaich, Andreas
    Mowlaee, Pejman
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2553 - 2557
  • [32] STFT Phase Reconstruction in Voiced Speech for an Improved Single-Channel Speech Enhancement
    Krawczyk, Martin
    Gerkmann, Timo
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (12) : 1931 - 1940
  • [33] SINGLE-CHANNEL SPEECH ENHANCEMENT IN A TRANSIENT NOISE ENVIRONMENT BY EXPLOITING SPEECH HARMONICITY
    Wu, Kai
    Reju, V. G.
    Khong, Andy W. H.
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 5088 - 5092
  • [34] A SPECTRAL CONVERSION BASED SINGLE-CHANNEL SINGLE-MICROPHONE SPEECH ENHANCEMENT
    Huy-Khoi Do
    Quang Vinh Thai
    FOURTH INTERNATIONAL CONFERENCE ON COMPUTER AND ELECTRICAL ENGINEERING (ICCEE 2011), 2011, : 583 - +
  • [35] ON SPEECH QUALITY ESTIMATION OF PHASE-AWARE SINGLE-CHANNEL SPEECH ENHANCEMENT
    Gaich, Andreas
    Mowlaee, Pejman
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 216 - 220
  • [36] Glance and gaze: A collaborative learning framework for single-channel speech enhancement
    Li, Andong
    Zheng, Chengshi
    Zhang, Lu
    Li, Xiaodong
    APPLIED ACOUSTICS, 2022, 187
  • [37] Phase Estimation in Single-Channel Speech Enhancement: Limits-Potential
    Mowlaee, Pejman
    Kulmer, Josef
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (08) : 1283 - 1294
  • [38] Two-Stage Temporal Processing for Single-Channel Speech Enhancement
    Samui, Sunzan
    Chakrabarti, Indrajit
    Ghosh, Soumya Kanti
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3723 - 3727
  • [39] Single-channel speech enhancement using Kalman filtering in the modulation domain
    So, Stephen
    Wojcicki, Kamil K.
    Paliwal, Kuldip K.
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 993 - 996
  • [40] Single-channel speech enhancement based on joint constrained dictionary learning
    Linhui Sun
    Yunyi Bu
    Pingan Li
    Zihao Wu
    EURASIP Journal on Audio, Speech, and Music Processing, 2021