Wavelet Speech Enhancement Based on Nonnegative Matrix Factorization

Cited by: 19

Authors:

Wang, Syu-Siang ^{[1]}

Chern, Alan ^{[2]}

Tsao, Yu ^{[2]}

Hung, Jeih-weih ^{[3]}

Lu, Xugang ^{[4]}

Lai, Ying-Hui ^{[2]}

Su, Borching ^{[1]}

Affiliations:

[1] Natl Taiwan Univ, Grad Inst Commun Engn, Taipei 10617, Taiwan

[2] Acad Sinica, Res Ctr Informat Technol Innovat, Taipei 11529, Taiwan

[3] Natl Chi Nan Univ, Dept Elect Engn, Nantou 545, Taiwan

[4] Natl Inst Informat & Commun Technol, Tokyo 1840015, Japan

Source:

IEEE SIGNAL PROCESSING LETTERS | 2016 / Vol. 23 / No. 08

Keywords:

Discrete wavelet packet transform (DWPT); nonnegative matrix factorization (NMF); short-time Fourier transform (STFT); speech enhancement (SE); NOISE; SUPPRESSION;

DOI:

10.1109/LSP.2016.2571727

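Chinese Library Classification:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Subject classification codes: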

0808 ; 0809 ;

Abstract:

In state-of-the-art speech enhancement (SE) techniques, a spectrogram is usually preferred to the corresponding time-domain raw data, since it gives a more compact representation and makes temporal information conspicuous over a long time span. However, two problems can cause distortions in conventional nonnegative matrix factorization (NMF)-based SE algorithms. One is related to the overlap-and-add operation used in short-time Fourier transform (STFT)-based signal reconstruction; the other is the direct use of the phase of the noisy speech as that of the enhanced speech in signal reconstruction. Both problems can introduce information loss or discontinuities in the reconstructed signal relative to the clean signal. To solve these two problems, we propose a novel SE method that adopts the discrete wavelet packet transform (DWPT) and NMF. In brief, the DWPT is first applied to split a time-domain speech signal into a series of subband signals. Then, we exploit NMF to highlight the speech component in each subband. These enhanced subband signals are joined together via the inverse DWPT to reconstruct a noise-reduced signal in the time domain. We evaluate the proposed DWPT-NMF-based SE method on the Mandarin hearing in noise test (MHINT) task. Experimental results show that this new method effectively enhances speech quality and intelligibility and outperforms the conventional STFT-NMF-based SE system.
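The pipeline named in the abstract (DWPT analysis, NMF applied to each subband, inverse DWPT) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the Haar filter, the non-overlapping framing, the toy speech and noise bases shared by all subbands, and the choice to factor the magnitudes of the signed subband samples and reapply a soft mask are all assumptions of this sketch; none of them is specified in the abstract. All function names are hypothetical.

```python
import math

def haar_split(x):
    """One analysis step: split an even-length signal into low and high half-bands."""
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return low, high

def haar_merge(low, high):
    """Exact inverse of haar_split (no overlap-add, no phase estimate involved)."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(low, high):
        x.extend([(a + d) / s, (a - d) / s])
    return x

def dwpt(x, levels):
    """Full wavelet packet tree; len(x) must be divisible by 2**levels."""
    bands = [list(x)]
    for _ in range(levels):
        bands = [half for b in bands for half in haar_split(b)]
    return bands

def idwpt(bands):
    """Merge neighbouring (low, high) pairs until one time-domain signal remains."""
    bands = [list(b) for b in bands]
    while len(bands) > 1:
        bands = [haar_merge(bands[i], bands[i + 1]) for i in range(0, len(bands), 2)]
    return bands[0]

def matmul(a, b):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]

def nmf_activations(v, w, iters=50, eps=1e-12):
    """Fixed bases w (F x K); multiplicative updates for H >= 0 on ||v - w h||^2."""
    wt = [list(r) for r in zip(*w)]
    wtv, wtw = matmul(wt, v), matmul(wt, w)
    h = [[1.0] * len(v[0]) for _ in range(len(wt))]
    for _ in range(iters):
        den = matmul(wtw, h)
        h = [[h[k][t] * wtv[k][t] / (den[k][t] + eps) for t in range(len(h[0]))]
             for k in range(len(h))]
    return h

def enhance_subband(band, w_speech, w_noise, frame_len, eps=1e-12):
    """Assumed scheme: NMF on |samples| per frame, then a speech/(speech+noise) mask."""
    n_frames = len(band) // frame_len  # trailing samples, if any, pass through
    v = [[abs(band[t * frame_len + f]) for t in range(n_frames)]
         for f in range(frame_len)]
    w = [rs + rn for rs, rn in zip(w_speech, w_noise)]  # concatenate bases
    h = nmf_activations(v, w)
    speech = matmul(w_speech, h[:len(w_speech[0])])
    total = matmul(w, h)
    out = list(band)
    for t in range(n_frames):
        for f in range(frame_len):
            out[t * frame_len + f] *= speech[f][t] / (total[f][t] + eps)
    return out

def enhance(x, levels, w_speech, w_noise, frame_len):
    """DWPT -> per-subband NMF masking -> inverse DWPT (toy bases shared by all bands)."""
    bands = dwpt(x, levels)
    bands = [enhance_subband(b, w_speech, w_noise, frame_len) for b in bands]
    return idwpt(bands)
```

In a supervised system the speech and noise bases would be learned from training data rather than hand-picked. Because `idwpt(dwpt(x, levels))` returns `x` up to rounding, any difference between input and output in this sketch comes from the subband masking alone.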


Pages: 1101-1105

Page count: 5

50 items in total

[1] Speech Enhancement Based on Codebook Constrained Nonnegative Matrix Factorization
Bai, Zhigang
Bao, Changchun
Yan, Bofang
[J]. 2018 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP), 2018: 361-365
[2] SPEECH ENHANCEMENT USING SEGMENTAL NONNEGATIVE MATRIX FACTORIZATION
Fan, Hao-Teng
Hung, Jeih-weih
Lu, Xugang
Wang, Syu-Siang
Tsao, Yu
[J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014
[3] Research on Speech Enhancement Based on Nonnegative Matrix Factorization and Improved Genetic Algorithm
Wang Wenqi
Zhang Hongjin
Fu Shan
[J]. PROCEEDINGS OF THE 35TH CHINESE CONTROL CONFERENCE 2016, 2016: 4950-4954
[4] Supervised and Unsupervised Speech Enhancement Using Nonnegative Matrix Factorization
Mohammadiha, Nasser
Smaragdis, Paris
Leijon, Arne
[J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21(10): 2140-2151
[5] SPEECH ENHANCEMENT USING NONNEGATIVE MATRIX FACTORIZATION WITH TEMPORAL CONTINUITY
Nam, Seung-Hyon
[J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2015, 34(03): 240-246
[6] AMPLITUDE-BASED SPEECH ENHANCEMENT WITH NONNEGATIVE MATRIX FACTORIZATION FOR ASYNCHRONOUS DISTRIBUTED RECORDING
Chiba, Hironobu
Ono, Nobutaka
Miyabe, Shigeki
Takahashi, Yu
Yamada, Takeshi
Makino, Shoji
[J]. 2014 14TH INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC), 2014: 203-207
[7] Speech enhancement based on nonnegative matrix factorization in constant-Q frequency domain
Xu, Longting
Wei, Zhilin
Zaidi, Syed Faham Ali
Ren, Bo
Yang, Jichen
[J]. APPLIED ACOUSTICS, 2021, 174
[8] Speech Enhancement Using Convolutive Nonnegative Matrix Factorization with Cosparsity Regularization
Mirbagheri, Majid
Xu, Yanbo
Akram, Sahar
Shamma, Shihab
[J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013: 456-459
[9] LINEAR DEMIXED DOMAIN MULTICHANNEL NONNEGATIVE MATRIX FACTORIZATION FOR SPEECH ENHANCEMENT
Taniguchi, Toru
Masuda, Taro
[J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017: 476-480
[10] A NEW LINEAR MMSE FILTER FOR SINGLE CHANNEL SPEECH ENHANCEMENT BASED ON NONNEGATIVE MATRIX FACTORIZATION
Mohammadiha, Nasser
Gerkmann, Timo
Leijon, Arne
[J]. 2011 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2011: 45-48
