Wavelet Speech Enhancement Based on Nonnegative Matrix Factorization

Cited by: 19
Authors
Wang, Syu-Siang [1 ]
Chern, Alan [2 ]
Tsao, Yu [2 ]
Hung, Jeih-weih [3 ]
Lu, Xugang [4 ]
Lai, Ying-Hui [2 ]
Su, Borching [1 ]
Affiliations
[1] Natl Taiwan Univ, Grad Inst Commun Engn, Taipei 10617, Taiwan
[2] Acad Sinica, Res Ctr Informat Technol Innovat, Taipei 11529, Taiwan
[3] Natl Chi Nan Univ, Dept Elect Engn, Nantou 545, Taiwan
[4] Natl Inst Informat & Commun Technol, Tokyo 1840015, Japan
Keywords
Discrete wavelet packet transform (DWPT); nonnegative matrix factorization (NMF); short-time Fourier transform (STFT); speech enhancement (SE); NOISE; SUPPRESSION;
DOI
10.1109/LSP.2016.2571727
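Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract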
In state-of-the-art speech enhancement (SE) techniques, a spectrogram is usually preferred over the corresponding time-domain raw data, since it offers a more compact representation together with conspicuous temporal information over a long time span. However, two problems can cause distortions in conventional nonnegative matrix factorization (NMF)-based SE algorithms. One is related to the overlap-and-add operation used in short-time Fourier transform (STFT)-based signal reconstruction; the other arises from directly using the phase of the noisy speech as that of the enhanced speech in signal reconstruction. These two problems can cause information loss or discontinuity in the reconstructed signal relative to the clean signal. To solve them, we propose a novel SE method that adopts the discrete wavelet packet transform (DWPT) and NMF. In brief, the DWPT is first applied to split a time-domain speech signal into a series of subband signals. Then, we exploit NMF to highlight the speech component in each subband. These enhanced subband signals are joined together via the inverse DWPT to reconstruct a noise-reduced signal in the time domain. We evaluate the proposed DWPT-NMF-based SE method on the Mandarin hearing in noise test (MHINT) task. Experimental results show that this new method effectively enhances speech quality and intelligibility and outperforms the conventional STFT-NMF-based SE system.
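The split, enhance, and rejoin flow described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the Haar wavelet is an assumption (the abstract does not name a wavelet), the function names are hypothetical, and the NMF step is left as a pluggable `enhance_subband` callback that defaults to the identity. The sketch shows that the subbands rejoin into a time-domain signal with no overlap-and-add and no phase estimate.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_split(x):
    """One analysis step: split an even-length signal into low and high subbands."""
    low = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return low, high

def haar_merge(low, high):
    """One synthesis step: exact inverse of haar_split."""
    x = []
    for a, d in zip(low, high):
        x.append((a + d) / SQRT2)
        x.append((a - d) / SQRT2)
    return x

def dwpt(x, levels):
    """Full wavelet packet tree; returns 2**levels subband signals.

    Assumes len(x) is divisible by 2**levels.
    """
    bands = [list(x)]
    for _ in range(levels):
        next_bands = []
        for band in bands:
            low, high = haar_split(band)
            next_bands.extend([low, high])
        bands = next_bands
    return bands

def idwpt(bands):
    """Inverse of dwpt: merge sibling subbands pairwise until one signal remains."""
    bands = [list(b) for b in bands]
    while len(bands) > 1:
        bands = [haar_merge(bands[i], bands[i + 1])
                 for i in range(0, len(bands), 2)]
    return bands[0]

def enhance(x, levels, enhance_subband=lambda band: band):
    """Split, process each subband, and rejoin.

    enhance_subband is a hypothetical placeholder for the per-subband
    NMF step; the identity default leaves the signal unchanged.
    """
    return idwpt([enhance_subband(band) for band in dwpt(x, levels)])

if __name__ == "__main__":
    signal = [math.sin(0.3 * n) for n in range(16)]
    restored = enhance(signal, levels=2)
    print(max(abs(a - b) for a, b in zip(signal, restored)))
```

With the identity callback the round trip reproduces the input up to floating-point error, so any change in the output comes only from whatever per-subband processing is plugged in.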
Pages: 1101-1105
Page count: 5
Related papers
50 in total
  • [1] Speech Enhancement Based on Codebook Constrained Nonnegative Matrix Factorization
    Bai, Zhigang
    Bao, Changchun
    Yan, Bofang
    [J]. 2018 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP), 2018, : 361 - 365
  • [2] SPEECH ENHANCEMENT USING SEGMENTAL NONNEGATIVE MATRIX FACTORIZATION
    Fan, Hao-Teng
    Hung, Jeih-weih
    Lu, Xugang
    Wang, Syu-Siang
    Tsao, Yu
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [3] Research on Speech Enhancement Based on Nonnegative Matrix Factorization and Improved Genetic Algorithm
    Wang Wenqi
    Zhang Hongjin
    Fu Shan
    [J]. PROCEEDINGS OF THE 35TH CHINESE CONTROL CONFERENCE 2016, 2016, : 4950 - 4954
  • [4] Supervised and Unsupervised Speech Enhancement Using Nonnegative Matrix Factorization
    Mohammadiha, Nasser
    Smaragdis, Paris
    Leijon, Arne
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (10): : 2140 - 2151
  • [5] SPEECH ENHANCEMENT USING NONNEGATIVE MATRIX FACTORIZATION WITH TEMPORAL CONTINUITY
    Nam, Seung-Hyon
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2015, 34 (03): : 240 - 246
  • [6] AMPLITUDE-BASED SPEECH ENHANCEMENT WITH NONNEGATIVE MATRIX FACTORIZATION FOR ASYNCHRONOUS DISTRIBUTED RECORDING
    Chiba, Hironobu
    Ono, Nobutaka
    Miyabe, Shigeki
    Takahashi, Yu
    Yamada, Takeshi
    Makino, Shoji
    [J]. 2014 14TH INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC), 2014, : 203 - 207
  • [7] Speech enhancement based on nonnegative matrix factorization in constant-Q frequency domain
    Xu, Longting
    Wei, Zhilin
    Zaidi, Syed Faham Ali
    Ren, Bo
    Yang, Jichen
    [J]. APPLIED ACOUSTICS, 2021, 174
  • [8] Speech Enhancement Using Convolutive Nonnegative Matrix Factorization with Cosparsity Regularization
    Mirbagheri, Majid
    Xu, Yanbo
    Akram, Sahar
    Shamma, Shihab
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 456 - 459
  • [9] LINEAR DEMIXED DOMAIN MULTICHANNEL NONNEGATIVE MATRIX FACTORIZATION FOR SPEECH ENHANCEMENT
    Taniguchi, Toru
    Masuda, Taro
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 476 - 480
  • [10] A NEW LINEAR MMSE FILTER FOR SINGLE CHANNEL SPEECH ENHANCEMENT BASED ON NONNEGATIVE MATRIX FACTORIZATION
    Mohammadiha, Nasser
    Gerkmann, Timo
    Leijon, Arne
    [J]. 2011 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2011, : 45 - 48