Wavelet Speech Enhancement Based on Nonnegative Matrix Factorization

Cited by: 19
Authors
Wang, Syu-Siang [1]
Chern, Alan [2]
Tsao, Yu [2]
Hung, Jeih-weih [3]
Lu, Xugang [4]
Lai, Ying-Hui [2]
Su, Borching [1]
Affiliations
[1] Natl Taiwan Univ, Grad Inst Commun Engn, Taipei 10617, Taiwan
[2] Acad Sinica, Res Ctr Informat Technol Innovat, Taipei 11529, Taiwan
[3] Natl Chi Nan Univ, Dept Elect Engn, Nantou 545, Taiwan
[4] Natl Inst Informat & Commun Technol, Tokyo 1840015, Japan
Keywords
Discrete wavelet packet transform (DWPT); nonnegative matrix factorization (NMF); short-time Fourier transform (STFT); speech enhancement (SE); NOISE; SUPPRESSION;
DOI
10.1109/LSP.2016.2571727
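Chinese Library Classification (CLC) number
TM [Electrical technology]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;
Abstract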
In state-of-the-art speech enhancement (SE) techniques, a spectrogram is usually preferred over the corresponding raw time-domain data, since it provides a more compact representation together with conspicuous temporal information over a long time span. However, two problems can cause distortions in conventional nonnegative matrix factorization (NMF)-based SE algorithms. One is related to the overlap-and-add operation used in short-time Fourier transform (STFT)-based signal reconstruction, and the other concerns directly using the phase of the noisy speech as that of the enhanced speech in signal reconstruction. These two problems can cause information loss or discontinuities in the reconstructed signal relative to the clean signal. To solve these two problems, we propose a novel SE method that adopts the discrete wavelet packet transform (DWPT) and NMF. In brief, the DWPT is first applied to split a time-domain speech signal into a series of subband signals. Then, we exploit NMF to highlight the speech component in each subband. These enhanced subband signals are joined together via the inverse DWPT to reconstruct a noise-reduced signal in the time domain. We evaluate the proposed DWPT-NMF-based SE method on the Mandarin hearing in noise test (MHINT) task. Experimental results show that this new method effectively enhances speech quality and intelligibility and outperforms the conventional STFT-NMF-based SE system.
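The three steps named in the abstract (DWPT split, per-subband NMF, inverse DWPT) can be sketched roughly as follows. This is an illustrative sketch under several assumptions that the abstract does not confirm: a Haar wavelet, non-overlapping frames of each subband, NMF applied to the absolute values of the framed subband samples, Euclidean multiplicative updates, and a soft mask applied to the signed samples. The function names (`haar_dwpt`, `haar_idwpt`, `nmf`, `train_bases`, `enhance`) and all parameter values are hypothetical and are not taken from the paper.

```python
import numpy as np

def haar_dwpt(x, levels):
    """Split x (length divisible by 2**levels) into 2**levels Haar subbands."""
    bands = [np.asarray(x, dtype=float)]
    for _ in range(levels):
        nxt = []
        for b in bands:
            even, odd = b[0::2], b[1::2]
            nxt.append((even + odd) / np.sqrt(2.0))  # low-pass branch
            nxt.append((even - odd) / np.sqrt(2.0))  # high-pass branch
        bands = nxt
    return bands

def haar_idwpt(bands):
    """Invert haar_dwpt by merging subband pairs level by level."""
    bands = list(bands)
    while len(bands) > 1:
        nxt = []
        for lo, hi in zip(bands[0::2], bands[1::2]):
            out = np.empty(2 * len(lo))
            out[0::2] = (lo + hi) / np.sqrt(2.0)
            out[1::2] = (lo - hi) / np.sqrt(2.0)
            nxt.append(out)
        bands = nxt
    return bands[0]

def nmf(V, rank, W=None, n_iter=200, seed=0, eps=1e-12):
    """Euclidean multiplicative-update NMF, V ~ W @ H. A given W is held fixed."""
    rng = np.random.default_rng(seed)
    fixed = W is not None
    if not fixed:
        W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        if not fixed:
            W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def train_bases(signal, rank, levels=2, frame_len=16):
    """Learn one nonnegative basis per subband (assumption: NMF on |samples|)."""
    return [nmf(np.abs(b.reshape(-1, frame_len).T), rank)[0]
            for b in haar_dwpt(signal, levels)]

def enhance(noisy, W_speech, W_noise, levels=2, frame_len=16, eps=1e-12):
    """Mask each subband with fixed speech/noise bases, then invert the DWPT.

    Assumes each subband length is divisible by frame_len.
    """
    out = []
    for b, Ws, Wn in zip(haar_dwpt(noisy, levels), W_speech, W_noise):
        S = b.reshape(-1, frame_len).T            # signed frames, no overlap
        W = np.hstack([Ws, Wn])
        _, H = nmf(np.abs(S), W.shape[1], W=W)    # activations only
        speech = Ws @ H[:Ws.shape[1]]
        noise = Wn @ H[Ws.shape[1]:]
        mask = speech / (speech + noise + eps)    # soft mask in [0, 1)
        out.append((mask * S).T.reshape(-1))      # sign is kept by masking S
    return haar_idwpt(out)
```

Because the frames do not overlap and the mask scales the signed subband samples directly, this sketch needs neither an overlap-and-add step nor a borrowed noisy phase, which are the two distortion sources the abstract attributes to STFT-based reconstruction. Whether the published method frames and masks the subbands in this way is not stated in the abstract.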
Pages: 1101 - 1105
Number of pages: 5
Related papers
50 in total
  • [21] Speech denoising using nonnegative matrix factorization with priors
    Wilson, Kevin W.
    Raj, Bhiksha
    Smaragdis, Paris
    Divakaran, Ajay
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4029 - +
  • [22] Discriminative Layered Nonnegative Matrix Factorization for Speech Separation
    Hsu, Chung-Chien
    Chi, Tai-Shih
    Chien, Jen-Tzung
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 560 - 564
  • [23] Deep Transductive Nonnegative Matrix Factorization for Speech Separation
    Liu, Yalin
    Guan, Naiyang
    Liu, Jie
    [J]. 2017 16TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2017, : 249 - 254
  • [24] UNSUPERVISED BEAMFORMING BASED ON MULTICHANNEL NONNEGATIVE MATRIX FACTORIZATION FOR NOISY SPEECH RECOGNITION
    Shimada, Kazuki
    Bando, Yoshiaki
    Mimura, Masato
    Itoyama, Katsutoshi
    Yoshii, Kazuyoshi
    Kawahara, Tatsuya
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5734 - 5738
  • [25] Adaptive recurrent nonnegative matrix factorization with phase compensation for Single-Channel speech enhancement
    Vanita Raj Tank
    Shrinivas Padmakar Mahajan
    [J]. Multimedia Tools and Applications, 2022, 81 : 28249 - 28294
  • [26] Adaptive recurrent nonnegative matrix factorization with phase compensation for Single-Channel speech enhancement
    Tank, Vanita Raj
    Mahajan, Shrinivas Padmakar
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (20) : 28249 - 28294
  • [27] Nonnegative matrix factorization applied to nonlinear speech and image cryptosystems
    Xie, Shengli
    Yang, Zuyuan
    Fu, Yuli
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2008, 55 (08) : 2356 - 2367
  • [28] Semi-Supervised Speech Enhancement Combining Nonnegative Matrix Factorization and Robust Principal Component Analysis
    Hu, Yonggang
    Zhang, Xiongwei
    Zou, Xia
    Sun, Meng
    Zheng, Yunfei
    Min, Gang
    [J]. IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2017, E100A (08) : 1714 - 1719
  • [29] Alpha-Stable Autoregressive Fast Multichannel Nonnegative Matrix Factorization for Joint Speech Enhancement and Dereverberation
    Fontaine, Mathieu
    Sekiguchi, Kouhei
    Nugraha, Aditya Arie
    Bando, Yoshiaki
    Yoshii, Kazuyoshi
    [J]. INTERSPEECH 2021, 2021, : 661 - 665
  • [30] Nonnegative Matrix Factorization
    Unknown
    [J]. IEEE CONTROL SYSTEMS MAGAZINE, 2021, 41 (03): : 102 - 102