Generalized Fast Multichannel Nonnegative Matrix Factorization Based on Gaussian Scale Mixtures for Blind Source Separation

Cited by: 1
Authors
Fontaine, Mathieu [1, 2]
Sekiguchi, Kouhei [2]
Nugraha, Aditya Arie [2]
Bando, Yoshiaki [4]
Yoshii, Kazuyoshi [2, 3]
Affiliations
[1] Telecom Paris, LTCI, Inst Polytech Paris, Palaiseau, France
[2] Ctr Adv Intelligence Project AIP, RIKEN, Tokyo 1030027, Japan
[3] Kyoto Univ, Grad Sch Informat, Sakyo Ku, Kyoto 6068501, Japan
[4] Natl Inst Adv Ind Sci & Technol, Koto Ku, Tokyo 1350064, Japan
Keywords
Nonnegative matrix factorization; blind source separation; probabilistic framework; expectation-maximization; independent vector analysis; speech enhancement; model
DOI
10.1109/TASLP.2022.3172631
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
This paper describes heavy-tailed extensions of a state-of-the-art, versatile blind source separation method called fast multichannel nonnegative matrix factorization (FastMNMF) from a unified point of view. The common way of deriving such an extension is to replace the multivariate complex Gaussian distribution in the likelihood function with a heavy-tailed generalization, e.g., the multivariate complex Student's t or leptokurtic generalized Gaussian distribution, and to tailor-make the corresponding parameter optimization algorithm. Using a wider class of heavy-tailed distributions called Gaussian scale mixtures (GSMs), i.e., mixtures of Gaussian distributions whose variances are perturbed by positive random scalars called impulse variables, we propose GSM-FastMNMF and develop an expectation-maximization algorithm that works even when the probability density function of the impulse variables has no analytical expression. We show that existing heavy-tailed FastMNMF extensions are instances of GSM-FastMNMF and derive a new instance based on the generalized hyperbolic distribution, which includes the normal-inverse Gaussian, Student's t, and Gaussian distributions as special cases. Our experiments show that the normal-inverse Gaussian FastMNMF outperforms the state-of-the-art FastMNMF extensions and the ILRMA model in speech enhancement and separation in terms of the signal-to-distortion ratio.
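As an illustration of the Gaussian scale mixture idea described in the abstract, the following is a minimal sketch and not the authors' implementation. It draws scalar complex samples whose Gaussian variance is multiplied by a positive impulse variable. It assumes one common parameterization in which an inverse-gamma impulse variable with shape and scale both equal to nu/2 yields a Student's t sample, and a constant impulse variable of 1 recovers the Gaussian case. All function names, the value of nu, and the tail threshold are hypothetical choices made for this example.

```python
import math
import random


def sample_complex_gaussian(rng, variance):
    """Draw one circular complex Gaussian sample with E|z|^2 = variance."""
    std = math.sqrt(variance / 2.0)  # split the variance over real and imaginary parts
    return complex(rng.gauss(0.0, std), rng.gauss(0.0, std))


def sample_inverse_gamma(rng, shape, scale):
    """Draw from InverseGamma(shape, scale) as scale / Gamma(shape, 1)."""
    return scale / rng.gammavariate(shape, 1.0)


def sample_gsm(rng, variance, impulse_sampler):
    """GSM draw: a Gaussian whose variance is scaled by a positive impulse variable."""
    phi = impulse_sampler(rng)
    return sample_complex_gaussian(rng, phi * variance)


def tail_fraction(samples, threshold):
    """Fraction of samples whose magnitude exceeds the threshold."""
    return sum(abs(x) > threshold for x in samples) / len(samples)


rng = random.Random(0)
n = 20000
nu = 3.0  # assumed degrees of freedom for the Student's t instance

# Constant impulse variable: plain complex Gaussian.
gaussian = [sample_gsm(rng, 1.0, lambda r: 1.0) for _ in range(n)]

# Inverse-gamma impulse variable: Student's t under the assumed parameterization.
student_t = [
    sample_gsm(rng, 1.0, lambda r: sample_inverse_gamma(r, nu / 2.0, nu / 2.0))
    for _ in range(n)
]

gaussian_tail = tail_fraction(gaussian, 3.0)
student_tail = tail_fraction(student_t, 3.0)
print("Gaussian tail fraction:", gaussian_tail)
print("Student's t tail fraction:", student_tail)
```

The Student's t samples should exceed the threshold far more often than the Gaussian ones, which is the heavy-tailed behavior the abstract refers to. Swapping `impulse_sampler` for another positive distribution gives a different member of the GSM family without changing the rest of the sketch.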
Pages: 1734-1748
Page count: 15
Related papers
50 in total
  • [1] FLOW-BASED FAST MULTICHANNEL NONNEGATIVE MATRIX FACTORIZATION FOR BLIND SOURCE SEPARATION
    Nugraha, Aditya Arie
    Sekiguchi, Kouhei
    Fontaine, Mathieu
    Bando, Yoshiaki
    Yoshii, Kazuyoshi
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 501 - 505
  • [2] AUTOREGRESSIVE FAST MULTICHANNEL NONNEGATIVE MATRIX FACTORIZATION FOR JOINT BLIND SOURCE SEPARATION AND DEREVERBERATION
    Sekiguchi, Kouhei
    Bando, Yoshiaki
    Nugraha, Aditya Arie
    Fontaine, Mathieu
    Yoshii, Kazuyoshi
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 511 - 515
  • [3] SPARSENESS-BASED MULTICHANNEL NONNEGATIVE MATRIX FACTORIZATION FOR BLIND SOURCE SEPARATION
    Higuchi, Takuya
    Yoshioka, Takuya
    Nakatani, Tomohiro
    [J]. 2016 IEEE INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC), 2016,
  • [4] Multichannel Nonnegative Matrix Factorization in Convolutive Mixtures for Audio Source Separation
    Ozerov, Alexey
    Fevotte, Cedric
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (03): : 550 - 563
  • [5] MULTICHANNEL NONNEGATIVE MATRIX FACTORIZATION IN CONVOLUTIVE MIXTURES. WITH APPLICATION TO BLIND AUDIO SOURCE SEPARATION.
    Ozerov, Alexey
    Fevotte, Cedric
    [J]. 2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 3137 - +
  • [6] STUDENT'S T MULTICHANNEL NONNEGATIVE MATRIX FACTORIZATION FOR BLIND SOURCE SEPARATION
    Kitamura, Koichi
    Bando, Yoshiaki
    Itoyama, Katsutoshi
    Yoshii, Kazuyoshi
    [J]. 2016 IEEE INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC), 2016,
  • [7] Joint Nonnegative Matrix Factorization for Underdetermined Blind Source Separation in Nonlinear Mixtures
    Kopriva, Ivica
    [J]. LATENT VARIABLE ANALYSIS AND SIGNAL SEPARATION (LVA/ICA 2018), 2018, 10891 : 107 - 115
  • [8] Minimum-Volume Multichannel Nonnegative Matrix Factorization for Blind Audio Source Separation
    Wang, Jianyu
    Guan, Shanzheng
    Liu, Shupei
    Zhang, Xiao-Lei
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 3089 - 3103
  • [9] Convolutive Transfer Function-Based Multichannel Nonnegative Matrix Factorization for Overdetermined Blind Source Separation
    Wang, Taihui
    Yang, Feiran
    Yang, Jun
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 802 - 815
  • [10] Underdetermined blind source separation using normalized spatial covariance matrix and multichannel nonnegative matrix factorization
    Oh, Son-hook
    Kim, Jung-Han
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2020, 39 (02): : 120 - 130