Unsupervised Robust Speech Enhancement Based on Alpha-Stable Fast Multichannel Nonnegative Matrix Factorization

Cited by: 4
Authors
Fontaine, Mathieu [1]
Sekiguchi, Kouhei [1, 2]
Nugraha, Aditya Arie [1]
Yoshii, Kazuyoshi [1, 2]
Affiliations
[1] RIKEN, AIP, Tokyo, Japan
[2] Kyoto Univ, Grad Sch Informat, Kyoto, Japan
Source
INTERSPEECH 2020
Keywords
speech enhancement; nonnegative matrix factorization; alpha-stable distribution; joint diagonalization; MIXTURES;
DOI
10.21437/Interspeech.2020-3202
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104; 100213;
Abstract
This paper describes multichannel speech enhancement based on a probabilistic model of complex source spectrograms for improving the intelligibility of speech corrupted by undesired noise. The univariate complex Gaussian model with the reproductive property supports the additivity of source complex spectrograms and forms the theoretical basis of nonnegative matrix factorization (NMF). Multichannel NMF (MNMF) is an extension of NMF based on the multivariate complex Gaussian model with spatial covariance matrices (SCMs), and its state-of-the-art variant called FastMNMF with jointly-diagonalizable SCMs achieves faster decomposition based on the univariate Gaussian model in the transformed domain where all time-frequency-channel elements are independent. Although a heavy-tailed extension of FastMNMF has been proposed to improve the robustness against impulsive noise, the source additivity has never been considered. The multivariate alpha-stable distribution does not have the reproductive property for the shape matrix parameter. This paper, therefore, proposes a heavy-tailed extension called alpha-stable FastMNMF which works in the transformed domain to use a univariate complex alpha-stable model, satisfying the reproductive property for any tail lightness parameter alpha and allowing the alpha-fractional Wiener filtering based on the element-wise source additivity. The experimental results show that alpha-stable FastMNMF with alpha = 1.8 significantly outperforms Gaussian FastMNMF (alpha = 2).
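The element-wise alpha-fractional Wiener filtering mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each source element follows an independent isotropic complex alpha-stable law with a known scale parameter, so that the mixture scale raised to the power alpha is the sum of the source scales raised to alpha (the reproductive property) and each source is estimated by a gain proportional to its own scale raised to alpha. The function name `alpha_fractional_wiener`, the array shapes, and the random test data are hypothetical.

```python
import numpy as np


def alpha_fractional_wiener(mixture, scales, alpha=1.8, eps=1e-12):
    """Element-wise alpha-fractional Wiener filter (illustrative sketch).

    mixture: complex array of shape (F, T), one observed element per bin.
    scales:  positive array of shape (N, F, T), assumed per-source scale
             parameters (hypothetical inputs, not estimated here).
    Returns the N source estimates, shape (N, F, T).
    """
    powered = np.asarray(scales, dtype=float) ** alpha
    # Gain of source n: sigma_n^alpha / sum_m sigma_m^alpha.
    gains = powered / (powered.sum(axis=0, keepdims=True) + eps)
    return gains * mixture[np.newaxis, ...]


# Toy data: 2 sources, 4 frequency bins, 5 frames (all values are made up).
rng = np.random.default_rng(0)
mixture = rng.standard_normal((4, 5)) + 1j * rng.standard_normal((4, 5))
scales = rng.uniform(0.1, 1.0, size=(2, 4, 5))

estimates = alpha_fractional_wiener(mixture, scales, alpha=1.8)
gaussian_estimates = alpha_fractional_wiener(mixture, scales, alpha=2.0)
```

Because the gains sum to one in every bin, the source estimates add back up to the mixture, and setting alpha to 2 reduces the gain to the usual variance-ratio Wiener filter of the Gaussian case.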
Pages: 4541-4545
Number of pages: 5
Related papers
50 in total
  • [1] Alpha-Stable Autoregressive Fast Multichannel Nonnegative Matrix Factorization for Joint Speech Enhancement and Dereverberation
    Fontaine, Mathieu
    Sekiguchi, Kouhei
    Nugraha, Aditya Arie
    Bando, Yoshiaki
    Yoshii, Kazuyoshi
    [J]. INTERSPEECH 2021, 2021, : 661 - 665
  • [2] UNSUPERVISED BEAMFORMING BASED ON MULTICHANNEL NONNEGATIVE MATRIX FACTORIZATION FOR NOISY SPEECH RECOGNITION
    Shimada, Kazuki
    Bando, Yoshiaki
    Mimura, Masato
    Itoyama, Katsutoshi
    Yoshii, Kazuyoshi
    Kawahara, Tatsuya
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5734 - 5738
  • [3] Supervised and Unsupervised Speech Enhancement Using Nonnegative Matrix Factorization
    Mohammadiha, Nasser
    Smaragdis, Paris
    Leijon, Arne
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (10): : 2140 - 2151
  • [4] Alpha-Stable Matrix Factorization
    Simsekli, Umut
    Liutkus, Antoine
    Cemgil, Ali Taylan
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2015, 22 (12) : 2289 - 2293
  • [5] LINEAR DEMIXED DOMAIN MULTICHANNEL NONNEGATIVE MATRIX FACTORIZATION FOR SPEECH ENHANCEMENT
    Taniguchi, Toru
    Masuda, Taro
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 476 - 480
  • [6] Wavelet Speech Enhancement Based on Nonnegative Matrix Factorization
    Wang, Syu-Siang
    Chern, Alan
    Tsao, Yu
    Hung, Jeih-weih
    Lu, Xugang
    Lai, Ying-Hui
    Su, Borching
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2016, 23 (08) : 1101 - 1105
  • [7] Speech Enhancement Based on Codebook Constrained Nonnegative Matrix Factorization
    Bai, Zhigang
    Bao, Changchun
    Yan, Bofang
    [J]. 2018 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP), 2018, : 361 - 365
  • [8] On microphone arrangement for multichannel speech enhancement based on nonnegative matrix factorization in time-channel domain
    Murase, Yoshikazu
    Chiba, Hironobu
    Ono, Nobutaka
    Miyabe, Shigeki
    Yamada, Takeshi
    Makino, Shoji
    [J]. 2014 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2014,
  • [9] Noise Suppression based on nonnegative matrix factorization for robust speech recognition
    Fan, Hao-teng
    Lin, Pao-han
    Hung, Jeih-weih
    [J]. 2014 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE, ELECTRONICS AND ELECTRICAL ENGINEERING (ISEEE), VOLS 1-3, 2014, : 1731 - +
  • [10] SPEECH ENHANCEMENT USING SEGMENTAL NONNEGATIVE MATRIX FACTORIZATION
    Fan, Hao-Teng
    Hung, Jeih-weih
    Lu, Xugang
    Wang, Syu-Siang
    Tsao, Yu
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,