Unsupervised Robust Speech Enhancement Based on Alpha-Stable Fast Multichannel Nonnegative Matrix Factorization

Cited by: 4

Authors:

Fontaine, Mathieu [1]

Sekiguchi, Kouhei [1,2]

Nugraha, Aditya Arie [1]

Yoshii, Kazuyoshi [1,2]

Affiliations:

[1] RIKEN, AIP, Tokyo, Japan

[2] Kyoto Univ, Grad Sch Informat, Kyoto, Japan

Source:

INTERSPEECH 2020 | 2020

Keywords:

speech enhancement; nonnegative matrix factorization; alpha-stable distribution; joint diagonalization; MIXTURES;

DOI:

10.21437/Interspeech.2020-3202

Chinese Library Classification number:

R36 [Pathology]; R76 [Otorhinolaryngology];

Discipline classification code:

100104 ; 100213 ;

Abstract:

This paper describes multichannel speech enhancement based on a probabilistic model of complex source spectrograms for improving the intelligibility of speech corrupted by undesired noise. The univariate complex Gaussian model with the reproductive property supports the additivity of source complex spectrograms and forms the theoretical basis of nonnegative matrix factorization (NMF). Multichannel NMF (MNMF) is an extension of NMF based on the multivariate complex Gaussian model with spatial covariance matrices (SCMs), and its state-of-the-art variant called FastMNMF with jointly-diagonalizable SCMs achieves faster decomposition based on the univariate Gaussian model in the transformed domain where all time-frequency-channel elements are independent. Although a heavy-tailed extension of FastMNMF has been proposed to improve the robustness against impulsive noise, the source additivity has never been considered. The multivariate alpha-stable distribution does not have the reproductive property for the shape matrix parameter. This paper, therefore, proposes a heavy-tailed extension called alpha-stable FastMNMF which works in the transformed domain to use a univariate complex alpha-stable model, satisfying the reproductive property for any tail lightness parameter alpha and allowing the alpha-fractional Wiener filtering based on the element-wise source additivity. The experimental results show that alpha-stable FastMNMF with alpha = 1.8 significantly outperforms Gaussian FastMNMF (alpha = 2).

Pages: 4541 - 4545

Number of pages: 5

50 items in total

[1] Alpha-Stable Autoregressive Fast Multichannel Nonnegative Matrix Factorization for Joint Speech Enhancement and Dereverberation
Fontaine, Mathieu
Sekiguchi, Kouhei
Nugraha, Aditya Arie
Bando, Yoshiaki
Yoshii, Kazuyoshi
[J]. INTERSPEECH 2021, 2021, : 661 - 665
[2] UNSUPERVISED BEAMFORMING BASED ON MULTICHANNEL NONNEGATIVE MATRIX FACTORIZATION FOR NOISY SPEECH RECOGNITION
Shimada, Kazuki
Bando, Yoshiaki
Mimura, Masato
Itoyama, Katsutoshi
Yoshii, Kazuyoshi
Kawahara, Tatsuya
[J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5734 - 5738
[3] Supervised and Unsupervised Speech Enhancement Using Nonnegative Matrix Factorization
Mohammadiha, Nasser
Smaragdis, Paris
Leijon, Arne
[J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (10) : 2140 - 2151
[4] Alpha-Stable Matrix Factorization
Simsekli, Umut
Liutkus, Antoine
Cemgil, Ali Taylan
[J]. IEEE SIGNAL PROCESSING LETTERS, 2015, 22 (12) : 2289 - 2293
[5] LINEAR DEMIXED DOMAIN MULTICHANNEL NONNEGATIVE MATRIX FACTORIZATION FOR SPEECH ENHANCEMENT
Taniguchi, Toru
Masuda, Taro
[J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 476 - 480
[6] Wavelet Speech Enhancement Based on Nonnegative Matrix Factorization
Wang, Syu-Siang
Chern, Alan
Tsao, Yu
Hung, Jeih-weih
Lu, Xugang
Lai, Ying-Hui
Su, Borching
[J]. IEEE SIGNAL PROCESSING LETTERS, 2016, 23 (08) : 1101 - 1105
[7] Speech Enhancement Based on Codebook Constrained Nonnegative Matrix Factorization
Bai, Zhigang
Bao, Changchun
Yan, Bofang
[J]. 2018 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP), 2018, : 361 - 365
[8] On microphone arrangement for multichannel speech enhancement based on nonnegative matrix factorization in time-channel domain
Murase, Yoshikazu
Chiba, Hironobu
Ono, Nobutaka
Miyabe, Shigeki
Yamada, Takeshi
Makino, Shoji
[J]. 2014 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2014
[9] Noise Suppression based on nonnegative matrix factorization for robust speech recognition
Fan, Hao-teng
Lin, Pao-han
Hung, Jeih-weih
[J]. 2014 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE, ELECTRONICS AND ELECTRICAL ENGINEERING (ISEEE), VOLS 1-3, 2014, : 1731 - +
[10] SPEECH ENHANCEMENT USING SEGMENTAL NONNEGATIVE MATRIX FACTORIZATION
Fan, Hao-Teng
Hung, Jeih-weih
Lu, Xugang
Wang, Syu-Siang
Tsao, Yu
[J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014