BLIND SPEECH SEPARATION EMPLOYING DIRECTIONAL STATISTICS IN AN EXPECTATION MAXIMIZATION FRAMEWORK

Citations: 0
Authors
Dang Hai Tran Vu [1]
Haeb-Umbach, Reinhold [1]
Affiliations
[1] Univ Gesamthsch Paderborn, Dept Commun Engn, D-33098 Paderborn, Germany
Keywords
Noisy Source Separation; Sparse Signal Separation; EM-Algorithm; Directional Statistics; Speech Enhancement;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
In this paper, we propose employing directional statistics in a complex vector space to approach the problem of blind speech separation in the presence of spatially correlated noise. We interpret the values of the short-time Fourier transform (STFT) of the microphone signals as draws from a mixture of complex Watson distributions, a probabilistic model that naturally accounts for spatial aliasing. The parameters of the density are related to the a priori source probabilities, the power of the sources, and the transfer function ratios from sources to sensors. Estimation formulas for these parameters are derived by employing the Expectation Maximization (EM) algorithm. The E-step corresponds to estimating the source presence probabilities for each time-frequency bin, while the M-step leads to a maximum signal-to-noise ratio (MaxSNR) beamformer in the presence of uncertainty about the source activity. Experimental results are reported for an implementation in a generalized sidelobe canceller (GSC)-like spatial beamforming configuration for three speech sources with significant coherent noise in reverberant environments, demonstrating the usefulness of the novel modeling framework.
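The E-step/M-step split described in the abstract can be illustrated with a heavily simplified sketch. This is not the authors' estimator. It assumes a mixture of complex Watson components that share one fixed concentration `kappa`, so the normalising constant cancels in the posteriors, and it omits the noise model, the source powers, and the MaxSNR beamformer. All function names (`normalize`, `inner`, `e_step`, `principal_eigvec`, `m_step`, `fit`) and the default values are hypothetical, chosen only for illustration.

```python
import math
import random


def normalize(v):
    """Scale a complex vector to unit Euclidean norm (zero vectors pass through)."""
    n = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def inner(w, z):
    """Hermitian inner product w^H z."""
    return sum(wi.conjugate() * zi for wi, zi in zip(w, z))


def e_step(Z, W, priors, kappa):
    """Posterior probability of each component for each unit-norm observation."""
    post = []
    for z in Z:
        # Assumption: shared kappa, so the Watson normaliser cancels out.
        logp = [math.log(max(p, 1e-300)) + kappa * abs(inner(w, z)) ** 2
                for w, p in zip(W, priors)]
        m = max(logp)
        e = [math.exp(l - m) for l in logp]
        s = sum(e)
        post.append([x / s for x in e])
    return post


def principal_eigvec(R, iters=50):
    """Power iteration for the dominant eigenvector of a Hermitian PSD matrix.

    Uses a fixed all-ones start vector, assumed not to be orthogonal to the
    dominant eigenvector.
    """
    d = len(R)
    v = normalize([1 + 0j] * d)
    for _ in range(iters):
        v = normalize([sum(R[i][j] * v[j] for j in range(d)) for i in range(d)])
    return v


def m_step(Z, post, k):
    """Update priors and mode vectors from posterior-weighted covariances."""
    d = len(Z[0])
    W, priors = [], []
    for c in range(k):
        R = [[0j] * d for _ in range(d)]
        for z, g in zip(Z, post):
            for i in range(d):
                for j in range(d):
                    R[i][j] += g[c] * z[i] * z[j].conjugate()
        W.append(principal_eigvec(R))
        priors.append(sum(g[c] for g in post) / len(Z))
    return W, priors


def fit(Z, k=2, kappa=20.0, iters=10, init=None, seed=0):
    """Run EM on the observations of one frequency bin; returns modes, priors, posteriors."""
    Z = [normalize(z) for z in Z]
    if init is None:
        init = random.Random(seed).sample(Z, k)  # hypothetical initialisation
    W = [normalize(w) for w in init]
    priors = [1.0 / k] * k
    for _ in range(iters):
        post = e_step(Z, W, priors, kappa)
        W, priors = m_step(Z, post, k)
    return W, priors, e_step(Z, W, priors, kappa)
```

Each observation is normalised to unit length before clustering, so only its direction in the complex vector space matters. The density depends on `|w^H z|^2` and is therefore invariant to a common phase factor, which is what makes a Watson-type model a natural fit for STFT vectors. With a shared positive `kappa`, the mode-vector update reduces to the principal eigenvector of the posterior-weighted spatial covariance matrix.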
Pages: 241 - 244
Page count: 4
Related papers
50 in total
  • [41] The expectation-maximization Viterbi algorithm for blind adaptive channel equalization
    Nguyen, H
    Levy, BC
    IEEE TRANSACTIONS ON COMMUNICATIONS, 2005, 53 (10) : 1671 - 1678
  • [42] DIRECTIONAL SPARSE FILTERING USING WEIGHTED LEHMER MEAN FOR BLIND SEPARATION OF UNBALANCED SPEECH MIXTURES
    Watcharasupat, Karn
    Nguyen, Anh H. T.
    Ooi, Ching-Hui
    Khong, Andy W. H.
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 4485 - 4489
  • [43] Model-Based Expectation-Maximization Source Separation and Localization
    Mandel, Michael I.
    Weiss, Ron J.
    Ellis, Daniel P. W.
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (02) : 382 - 394
  • [44] SPEAKER LOCALIZATION AND SEPARATION USING INCREMENTAL DISTRIBUTED EXPECTATION-MAXIMIZATION
    Dorfan, Yuval
    Cherkassky, Dani
    Gannot, Sharon
    2015 23RD EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2015, : 1256 - 1260
  • [45] An expectation-maximization method for spatio-temporal blind source separation using an AR-MOG source model
    Hild, Kenneth E., II
    Attias, Hagal T.
    Nagarajan, Srikantan S.
    IEEE TRANSACTIONS ON NEURAL NETWORKS, 2008, 19 (03) : 508 - 519
  • [46] Variable Step-Size Speech Blind Separation Employing Laplacian Normal Mixture Distribution Model
    Zhang, Xueying
    Zhi, Zhenhua
    Zhang, Xiaomei
    2008 IEEE ASIA PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS (APCCAS 2008), VOLS 1-4, 2008, : 785 - 788
  • [47] An expectation maximization framework for an improved ultrasound-based tissue characterization
    Alessandrini, Martino
    Maggio, Simona
    Poree, Jonathan
    De Marchi, Luca
    Speciale, Nicolo
    Franceschini, Emilie
    Bernard, Olivier
    Basset, Olivier
    MEDICAL IMAGING 2011: ULTRASONIC IMAGING, TOMOGRAPHY, AND THERAPY, 2011, 7968
  • [48] A ROBUST ACTIVE SHAPE MODEL USING AN EXPECTATION-MAXIMIZATION FRAMEWORK
    Santiago, Carlos
    Nascimento, Jacinto C.
    Marques, Jorge S.
    2014 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2014, : 6076 - 6080
  • [49] Maximization of component disjointness: A criterion for blind source separation
    Anemueller, Joern
    Independent Component Analysis and Signal Separation, Proceedings, 2007, 4666 : 325 - 332
  • [50] Blind separation of delayed sources based on information maximization
    Torkkola, K
    1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 3509 - 3512