Supervised and Unsupervised Speech Enhancement Using Nonnegative Matrix Factorization

Cited by: 328
Authors
Mohammadiha, Nasser [1]
Smaragdis, Paris [2, 3]
Leijon, Arne [1]
Affiliations
[1] KTH Royal Inst Technol, Dept Elect Engn, SE-10044 Stockholm, Sweden
[2] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
[3] Univ Illinois, Dept Elect & Comp Engn, Urbana, IL 61801 USA
Keywords
Bayesian inference; HMM; nonnegative matrix factorization (NMF); PLCA; speech enhancement; SQUARE ERROR ESTIMATION; NOISE; SEPARATION; SIGNALS;
DOI
10.1109/TASL.2013.2270369
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Reducing the interference noise in a monaural noisy speech signal has been a challenging task for many years. Compared to traditional unsupervised speech enhancement methods, e.g., Wiener filtering, supervised approaches, such as algorithms based on hidden Markov models (HMM), lead to higher-quality enhanced speech signals. However, the main practical difficulty of these approaches is that for each noise type a model is required to be trained a priori. In this paper, we investigate a new class of supervised speech denoising algorithms using nonnegative matrix factorization (NMF). We propose a novel speech enhancement method that is based on a Bayesian formulation of NMF (BNMF). To circumvent the mismatch problem between the training and testing stages, we propose two solutions. First, we use an HMM in combination with BNMF (BNMF-HMM) to derive a minimum mean square error (MMSE) estimator for the speech signal with no information about the underlying noise type. Second, we suggest a scheme to learn the required noise BNMF model online, which is then used to develop an unsupervised speech enhancement system. Extensive experiments are carried out to investigate the performance of the proposed methods under different conditions. Moreover, we compare the performance of the developed algorithms with state-of-the-art speech enhancement schemes using various objective measures. Our simulations show that the proposed BNMF-based methods outperform the competing algorithms substantially.
Pages: 2140 - 2151
Number of pages: 12
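Illustrative sketch
The supervised NMF denoising idea mentioned in the abstract can be sketched as follows. This is a minimal, generic example and not the paper's BNMF or BNMF-HMM method: it assumes plain KL-divergence NMF with multiplicative updates, pre-trained speech and noise bases, and a Wiener-like soft mask. The function names (`nmf_kl`, `enhance`), the ranks, and the random matrices standing in for magnitude spectrograms are all hypothetical choices made for illustration.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def nmf_kl(V, rank, n_iter=100, W_fixed=None, seed=0):
    """Approximate V (freq x frames) as W @ H using KL-divergence
    multiplicative updates. If W_fixed is given, only H is updated
    and `rank` is ignored."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + EPS if W_fixed is None else W_fixed
    H = rng.random((W.shape[1], n_frames)) + EPS
    ones = np.ones_like(V)
    for _ in range(n_iter):
        H *= (W.T @ (V / (W @ H + EPS))) / (W.T @ ones + EPS)
        if W_fixed is None:
            W *= ((V / (W @ H + EPS)) @ H.T) / (ones @ H.T + EPS)
    return W, H


def enhance(V_noisy, W_speech, W_noise, n_iter=100):
    """Decompose the noisy spectrogram on the concatenated (fixed) bases
    and apply a soft mask to keep the speech part."""
    W = np.hstack([W_speech, W_noise])
    _, H = nmf_kl(V_noisy, W.shape[1], n_iter=n_iter, W_fixed=W)
    k = W_speech.shape[1]
    S = W_speech @ H[:k]   # speech estimate
    N = W_noise @ H[k:]    # noise estimate
    mask = S / (S + N + EPS)
    return mask * V_noisy


# Synthetic stand-ins for magnitude spectrograms (assumption: 64 bins).
rng = np.random.default_rng(1)
V_speech_train = rng.random((64, 100))
V_noise_train = rng.random((64, 100))
V_noisy = rng.random((64, 50))

W_speech, _ = nmf_kl(V_speech_train, rank=8)           # training stage
W_noise, _ = nmf_kl(V_noise_train, rank=4, seed=1)     # one model per noise type
enhanced = enhance(V_noisy, W_speech, W_noise)         # testing stage
```

The need for `W_noise` to be trained in advance corresponds to the practical difficulty the abstract points out: a supervised scheme of this kind requires one noise model per noise type, which is the limitation the paper's noise-independent and online variants aim to remove.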
Related papers
50 in total
  • [1] Supervised and Semi-supervised Speech Enhancement Using Weighted Nonnegative Matrix Factorization
    Zou, Xia
    Hu, Yonggang
    Zhang, Xiongwei
    2017 9TH INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS AND SIGNAL PROCESSING (WCSP), 2017,
  • [2] SPEECH ENHANCEMENT USING SEGMENTAL NONNEGATIVE MATRIX FACTORIZATION
    Fan, Hao-Teng
    Hung, Jeih-weih
    Lu, Xugang
    Wang, Syu-Siang
    Tsao, Yu
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [3] SPEECH ENHANCEMENT USING NONNEGATIVE MATRIX FACTORIZATION WITH TEMPORAL CONTINUITY
    Nam, Seung-Hyon
    JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2015, 34 (03) : 240 - 246
  • [4] Speech Enhancement Using Convolutive Nonnegative Matrix Factorization with Cosparsity Regularization
    Mirbagheri, Majid
    Xu, Yanbo
    Akram, Sahar
    Shamma, Shihab
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 456 - 459
  • [5] Wavelet Speech Enhancement Based on Nonnegative Matrix Factorization
    Wang, Syu-Siang
    Chern, Alan
    Tsao, Yu
    Hung, Jeih-weih
    Lu, Xugang
    Lai, Ying-Hui
    Su, Borching
    IEEE SIGNAL PROCESSING LETTERS, 2016, 23 (08) : 1101 - 1105
  • [6] Document classification with unsupervised nonnegative matrix factorization and supervised percetron learning
    Barman, Paresh Chandra
    Lee, Soo-Young
    2007 INTERNATIONAL CONFERENCE ON INFORMATION ACQUISITION, VOLS 1 AND 2, 2007, : 183 - +
  • [7] Unsupervised Robust Speech Enhancement Based on Alpha-Stable Fast Multichannel Nonnegative Matrix Factorization
    Fontaine, Mathieu
    Sekiguchi, Kouhei
    Nugraha, Aditya Arie
    Yoshii, Kazuyoshi
    INTERSPEECH 2020, 2020, : 4541 - 4545
  • [8] Speech Enhancement Based on Codebook Constrained Nonnegative Matrix Factorization
    Bai, Zhigang
    Bao, Changchun
    Yan, Bofang
    2018 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP), 2018, : 361 - 365
  • [9] Semi-Supervised Speech Enhancement Combining Nonnegative Matrix Factorization and Robust Principal Component Analysis
    Hu, Yonggang
    Zhang, Xiongwei
    Zou, Xia
    Sun, Meng
    Zheng, Yunfei
    Min, Gang
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2017, E100A (08) : 1714 - 1719
  • [10] Efficient Model Selection for Speech Enhancement Using a Deflation Method for Nonnegative Matrix Factorization
    Kim, Minje
    Smaragdis, Paris
    2014 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP), 2014, : 537 - 541