Supervised and Unsupervised Speech Enhancement Using Nonnegative Matrix Factorization

Cited by: 328

Authors:

Mohammadiha, Nasser [1]

Smaragdis, Paris [2, 3]

Leijon, Arne [1]

Affiliations:

[1] KTH Royal Inst Technol, Dept Elect Engn, SE-10044 Stockholm, Sweden

[2] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA

[3] Univ Illinois, Dept Elect & Comp Engn, Urbana, IL 61801 USA

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2013 / Vol. 21 / Issue 10

Keywords:

Bayesian inference; HMM; nonnegative matrix factorization (NMF); PLCA; speech enhancement; SQUARE ERROR ESTIMATION; NOISE; SEPARATION; SIGNALS;

DOI:

10.1109/TASL.2013.2270369

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification code:

070206 ; 082403 ;

Abstract:

Reducing the interference noise in a monaural noisy speech signal has been a challenging task for many years. Compared to traditional unsupervised speech enhancement methods, e.g., Wiener filtering, supervised approaches, such as algorithms based on hidden Markov models (HMM), lead to higher-quality enhanced speech signals. However, the main practical difficulty of these approaches is that for each noise type a model is required to be trained a priori. In this paper, we investigate a new class of supervised speech denoising algorithms using nonnegative matrix factorization (NMF). We propose a novel speech enhancement method that is based on a Bayesian formulation of NMF (BNMF). To circumvent the mismatch problem between the training and testing stages, we propose two solutions. First, we use an HMM in combination with BNMF (BNMF-HMM) to derive a minimum mean square error (MMSE) estimator for the speech signal with no information about the underlying noise type. Second, we suggest a scheme to learn the required noise BNMF model online, which is then used to develop an unsupervised speech enhancement system. Extensive experiments are carried out to investigate the performance of the proposed methods under different conditions. Moreover, we compare the performance of the developed algorithms with state-of-the-art speech enhancement schemes using various objective measures. Our simulations show that the proposed BNMF-based methods outperform the competing algorithms substantially.
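As a rough illustration of the supervised NMF denoising idea the abstract refers to, the sketch below trains separate speech and noise dictionaries on magnitude spectrograms, then fixes both dictionaries and estimates only the activations on the noisy input, finishing with a Wiener-like soft mask. This is plain NMF with Kullback-Leibler multiplicative updates, not the Bayesian formulation (BNMF), the BNMF-HMM estimator, or the online noise learning proposed in the paper. The function names `nmf_kl` and `enhance`, the dictionary sizes, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def nmf_kl(V, rank, n_iter=200, W_fixed=None, seed=0, eps=1e-12):
    """Approximate V by W @ H using KL-divergence multiplicative updates.

    Hypothetical helper: if W_fixed is given, the dictionary is kept
    unchanged and only the activations H are updated.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = W_fixed if W_fixed is not None else rng.random((n_freq, rank)) + eps
    H = rng.random((W.shape[1], n_frames)) + eps
    ones = np.ones_like(V)
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.T @ ones + eps)
        if W_fixed is None:
            WH = W @ H + eps
            W *= ((V / WH) @ H.T) / (ones @ H.T + eps)
    return W, H


def enhance(V_noisy, W_speech, W_noise, n_iter=200, eps=1e-12):
    """Estimate the speech magnitude spectrogram from a noisy one.

    Both dictionaries are assumed to be pre-trained; only the
    activations are fitted to the noisy input.
    """
    W = np.hstack([W_speech, W_noise])
    _, H = nmf_kl(V_noisy, W.shape[1], n_iter=n_iter, W_fixed=W)
    k = W_speech.shape[1]
    S = W_speech @ H[:k]      # speech part of the reconstruction
    N = W_noise @ H[k:]       # noise part of the reconstruction
    mask = S / (S + N + eps)  # soft mask with values in [0, 1)
    return mask * V_noisy


# Synthetic stand-ins for magnitude spectrograms (64 bins x 100 frames).
rng = np.random.default_rng(1)
speech_train = rng.random((64, 5)) @ rng.random((5, 100))
noise_train = rng.random((64, 3)) @ rng.random((3, 100))

# Training stage: one dictionary per source.
W_s, _ = nmf_kl(speech_train, rank=5)
W_n, _ = nmf_kl(noise_train, rank=3)

# Test stage: additive mixture of the first 20 frames of each source.
V_noisy = speech_train[:, :20] + noise_train[:, :20]
S_hat = enhance(V_noisy, W_s, W_n)
```

Because the noise dictionary here is trained in advance on the matching noise type, the sketch shows exactly the setting whose practical difficulty the abstract points out: a separate model is needed for each noise type.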


Pages: 2140-2151

Page count: 12

