Unsupervised Robust Speech Enhancement Based on Alpha-Stable Fast Multichannel Nonnegative Matrix Factorization

Cited by: 4
Authors
Fontaine, Mathieu [1]
Sekiguchi, Kouhei [1, 2]
Nugraha, Aditya Arie [1]
Yoshii, Kazuyoshi [1, 2]
Affiliations
[1] RIKEN, AIP, Tokyo, Japan
[2] Kyoto Univ, Grad Sch Informat, Kyoto, Japan
Source
Keywords
speech enhancement; nonnegative matrix factorization; alpha-stable distribution; joint diagonalization; MIXTURES;
DOI
10.21437/Interspeech.2020-3202
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology];
Subject classification codes
100104 ; 100213 ;
Abstract
This paper describes multichannel speech enhancement based on a probabilistic model of complex source spectrograms for improving the intelligibility of speech corrupted by undesired noise. The univariate complex Gaussian model with the reproductive property supports the additivity of source complex spectrograms and forms the theoretical basis of nonnegative matrix factorization (NMF). Multichannel NMF (MNMF) is an extension of NMF based on the multivariate complex Gaussian model with spatial covariance matrices (SCMs), and its state-of-the-art variant called FastMNMF with jointly-diagonalizable SCMs achieves faster decomposition based on the univariate Gaussian model in the transformed domain where all time-frequency-channel elements are independent. Although a heavy-tailed extension of FastMNMF has been proposed to improve the robustness against impulsive noise, the source additivity has never been considered. The multivariate alpha-stable distribution does not have the reproductive property for the shape matrix parameter. This paper, therefore, proposes a heavy-tailed extension called alpha-stable FastMNMF which works in the transformed domain to use a univariate complex alpha-stable model, satisfying the reproductive property for any tail lightness parameter alpha and allowing the alpha-fractional Wiener filtering based on the element-wise source additivity. The experimental results show that alpha-stable FastMNMF with alpha = 1.8 significantly outperforms Gaussian FastMNMF (alpha = 2).
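The element-wise filtering idea in the abstract can be illustrated with a minimal sketch for a single time-frequency bin. This is not the authors' implementation: the function name `alpha_fractional_wiener`, the diagonalizing matrix `Q`, and the array `scales_alpha` (each source's scale raised to the power alpha, per transformed channel) are all assumptions made for illustration. In the paper's model these quantities would presumably be estimated from the NMF and spatial parameters; here they are simply given as inputs.

```python
import numpy as np

def alpha_fractional_wiener(x, Q, scales_alpha):
    """Illustrative element-wise filter for one time-frequency bin.

    x            : (M,) complex mixture across M microphones
    Q            : (M, M) invertible matrix mapping to the transformed domain (assumed given)
    scales_alpha : (N, M) nonnegative values, sigma^alpha of each of N sources
                   in each transformed channel (assumed given)
    Returns an (N, M) array of per-source estimates in the microphone domain.
    """
    z = Q @ x                                   # mixture in the transformed domain
    total = scales_alpha.sum(axis=0)            # sigma^alpha adds across sources
    gains = scales_alpha / np.maximum(total, 1e-12)  # per-element masks, sum to 1 over sources
    z_sources = gains * z[None, :]              # (N, M) filtered transformed-domain estimates
    # Map each source estimate back to the microphone domain by solving Q y = z_n.
    return np.linalg.solve(Q, z_sources.T).T
```

Because the masks sum to one over the sources in every transformed channel, the per-source estimates add back up to the observed mixture, which is the additivity property the abstract emphasizes.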
Pages: 4541-4545
Page count: 5
Related papers
50 in total
  • [41] Reducing Computational Complexity of Multichannel Nonnegative Matrix Factorization Using Initial Value Setting for Speech Recognition
    Izumi, Taiki
    Aihara, Ryo
    Hanazawa, Toshiyuki
    Okato, Yohei
    Uramoto, Takanobu
    Uenohara, Shingo
    Furuya, Ken'ichi
    [J]. COMPLEX, INTELLIGENT, AND SOFTWARE INTENSIVE SYSTEMS, 2019, 772 : 893 - 900
  • [42] Convolutive Nonnegative Matrix Factorization with Markov Random Field Smoothing for Blind Unmixing of Multichannel Speech Recordings
    Zdunek, Rafal
    [J]. ADVANCES IN NONLINEAR SPEECH PROCESSING, 2011, 7015 : 25 - 32
  • [43] Local Sparsity Based Online Dictionary Learning for Environment-Adaptive Speech Enhancement with Nonnegative Matrix Factorization
    Jeon, Kwang Myung
    Kim, Hong Kook
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2861 - 2865
  • [44] A fast algorithm for hyperspectral unmixing based on constrained nonnegative matrix factorization
    Liu, Jian-Jun
    Wu, Ze-Bin
    Wei, Zhi-Hui
    Xiao, Liang
    Sun, Le
    [J]. Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2013, 41 (03): : 432 - 437
  • [45] Ray-Space-Based Multichannel Nonnegative Matrix Factorization for Audio Source Separation
    Pezzoli, Mirco
    Carabias-Orti, Julio Jose
    Cobos, Maximo
    Antonacci, Fabio
    Sarti, Augusto
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 369 - 373
  • [46] Efficient parallel kernel based on Cholesky decomposition to accelerate multichannel nonnegative matrix factorization
    Antonio J. Muñoz-Montoro
    Julio J. Carabias-Orti
    Daniele Salvati
    Raquel Cortina
    [J]. The Journal of Supercomputing, 2023, 79 : 20649 - 20664
  • [47] Efficient parallel kernel based on Cholesky decomposition to accelerate multichannel nonnegative matrix factorization
    Munoz-Montoro, Antonio J.
    Carabias-Orti, Julio J.
    Salvati, Daniele
    Cortina, Raquel
    [J]. JOURNAL OF SUPERCOMPUTING, 2023, 79 (18): : 20649 - 20664
  • [48] A robust DOA estimator based on the correntropy in alpha-stable noise environments
    Wang, Peng
    Qiu, Tian-shuang
    Ren, Fu-quan
    Song, Ai-min
    [J]. DIGITAL SIGNAL PROCESSING, 2017, 60 : 242 - 251
  • [49] Adaptive recurrent nonnegative matrix factorization with phase compensation for Single-Channel speech enhancement
    Tank, Vanita Raj
    Mahajan, Shrinivas Padmakar
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (20) : 28249 - 28294
  • [50] Adaptive recurrent nonnegative matrix factorization with phase compensation for Single-Channel speech enhancement
    Vanita Raj Tank
    Shrinivas Padmakar Mahajan
    [J]. Multimedia Tools and Applications, 2022, 81 : 28249 - 28294