Blind Speech Separation and Enhancement With GCC-NMF

Cited by: 40
Authors
Wood, Sean U. N. [1]
Rouat, Jean [1]
Dupont, Stephane [2]
Pironkov, Gueorgui [2]
Affiliations
[1] Univ Sherbrooke, Dept Elect & Comp Engn, NECOTIS, Sherbrooke, PQ J1K 2R1, Canada
[2] Univ Mons, Dept Theory Circuits & Signal Proc, B-7000 Mons, Belgium
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Blind speech separation; CASA; cocktail party problem; GCC; interaural time difference; NMF; PHAT; NONNEGATIVE MATRIX FACTORIZATION; AUDIO SOURCE SEPARATION; INFORMATION; MODELS;
DOI
10.1109/TASLP.2017.2656805
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
We present a blind source separation algorithm named GCC-NMF that combines unsupervised dictionary learning via non-negative matrix factorization (NMF) with spatial localization via the generalized cross correlation (GCC) method. Dictionary learning is performed on the mixture signal, with separation subsequently achieved by grouping dictionary atoms, at each point in time, according to their spatial origins. The resulting source separation algorithm is simple yet flexible, requiring no prior knowledge or information. Separation quality is evaluated for three tasks using stereo recordings from the publicly available SiSEC signal separation evaluation campaign: 3 and 4 concurrent speakers in reverberant environments, speech mixed with real-world background noise, and noisy recordings of a moving speaker. Performance is quantified using perceptually motivated and SNR-based measures with the PEASS and BSS Eval toolkits, respectively. We evaluate the effects of model parameters on separation quality, and compare our approach with other unsupervised and semi-supervised speech separation and enhancement approaches. We show that GCC-NMF is a flexible source separation algorithm, outperforming task-specific approaches in each of the three settings, including both blind as well as several informed approaches that require prior knowledge or information.
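The idea summarized in the abstract (learn an NMF dictionary from the mixture, then group atoms at each time frame by their estimated spatial origin) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `nmf_kl` and `gcc_nmf_masks`, the KL-divergence multiplicative updates, the averaging of left and right activations, the sign convention for the delay, and the nearest-delay assignment rule are all choices made here for illustration, and the source time delays are assumed to be known already rather than estimated.

```python
import numpy as np


def nmf_kl(V, n_atoms, n_iter=100, seed=0, eps=1e-12):
    """NMF via multiplicative updates for the generalized KL divergence.

    V is a non-negative (F x N) matrix; returns W (F x K) and H (K x N).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_cols = V.shape
    W = rng.random((n_freq, n_atoms)) + eps
    H = rng.random((n_atoms, n_cols)) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H


def gcc_nmf_masks(L, R, freqs_hz, taus_s, source_taus_s,
                  n_atoms=32, n_iter=100, eps=1e-12):
    """Return one soft time-frequency mask per source, shape (S x F x T).

    L, R          : complex STFTs of the left/right channels, each (F x T)
    freqs_hz      : centre frequency of each STFT bin, (F,)
    taus_s        : candidate time delays to scan, in seconds, (D,)
    source_taus_s : assumed delay of each source, in seconds, (S,)
    Convention (an assumption): a delay tau means R = L * exp(-2j*pi*f*tau).
    """
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    taus_s = np.asarray(taus_s, dtype=float)
    source_taus_s = np.asarray(source_taus_s, dtype=float)
    n_frames = L.shape[1]

    # 1) Dictionary learning on the mixture: both channels share one dictionary.
    V = np.concatenate([np.abs(L), np.abs(R)], axis=1)        # F x 2T
    W, H = nmf_kl(V, n_atoms, n_iter)
    H_mean = 0.5 * (H[:, :n_frames] + H[:, n_frames:])        # K x T (simplification)

    # 2) PHAT-weighted cross-spectrum and steering terms for each candidate delay.
    cross = L * np.conj(R)
    phat = cross / (np.abs(cross) + eps)                      # F x T
    steer = np.exp(-2j * np.pi * freqs_hz[:, None] * taus_s[None, :])  # F x D

    # 3) Per-atom GCC: weight each frequency by the atom's spectrum.
    gcc = np.einsum('fk,fd,ft->kdt', W, steer, phat).real     # K x D x T

    # 4) At each frame, assign each atom to the source with the nearest delay.
    tau_hat = taus_s[np.argmax(gcc, axis=1)]                  # K x T
    dist = np.abs(tau_hat[:, :, None] - source_taus_s[None, None, :])
    assign = np.argmin(dist, axis=2)                          # K x T

    # 5) Wiener-like soft masks built from the atoms assigned to each source.
    denom = W @ H_mean + eps
    masks = [(W @ (H_mean * (assign == s))) / denom
             for s in range(len(source_taus_s))]
    return np.stack(masks)
```

Multiplying `masks[s]` element-wise with `L` and `R` and inverting the STFT would give a stereo estimate of source `s`. Because every atom is assigned to exactly one source at each frame, the masks sum to approximately one at every time-frequency point.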
Pages: 745-755
Page count: 11
Related papers
50 in total
  • [41] Modular NMF and DNN Speech Enhancement Approach with Update Noise Base
    Moghaddam, Hamidreza Asjodi
    Seyedin, Sanaz
    2020 28TH IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2020, : 415 - 419
  • [42] Speech Enhancement using Non negative Matrix Factorization and Enhanced NMF
    Akarsh, K. A.
    Selvi, Senthamizh R.
    2015 INTERNATIONAL CONFERENCE ON CIRCUITS, POWER AND COMPUTING TECHNOLOGIES (ICCPCT-2015), 2015,
  • [43] Classifying NMF Components Based on Vector Similarity for Speech and Music Separation
    Zheng, Nengheng
    Cai, Yi
    Li, Xia
    Lee, Tan
    2012 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2012,
  • [44] Blind Separation Algorithm of Mixed Minerals Hyperspectral Base on NMF Mode
    Wang Jin-hua
    Dai Jia-le
    Li Meng-qian
    Liu Wei
    Miao Ruo-fan
    SPECTROSCOPY AND SPECTRAL ANALYSIS, 2023, 43 (08) : 2458 - 2466
  • [45] Weibull and Nakagami speech priors based regularized NMF with adaptive wiener filter for speech enhancement
    Jannu C.
    Vanambathina S.D.
    International Journal of Speech Technology, 2023, 26 (01) : 197 - 209
  • [46] NMF based speech and music separation in monaural speech recordings with sparseness and temporal continuity constraints
    Tu, Ming
    Xie, Xiang
    Jiao, Yishan
    PROCEEDINGS OF 3RD INTERNATIONAL CONFERENCE ON MULTIMEDIA TECHNOLOGY (ICMT-13), 2013, 84 : 548 - 555
  • [47] Nonlinear postprocessing for blind speech separation
    Kolossa, D
    Orglmeister, R
    INDEPENDENT COMPONENT ANALYSIS AND BLIND SIGNAL SEPARATION, 2004, 3195 : 832 - 839
  • [48] Convolutive Blind Speech Separation by Decorrelation
    Wang, Fuxiang
    Zhang, Jun
    ADVANCES IN NEURO-INFORMATION PROCESSING, PT I, 2009, 5506 : 737 - 744
  • [49] Blind source separation of speech in hardware
    Hurley, N
    Harte, N
    Fearon, C
    Rickard, S
    2005 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS - DESIGN AND IMPLEMENTATION (SIPS), 2005, : 442 - 445
  • [50] New automatic forward and backward blind sources separation algorithms for noise reduction and speech enhancement
    Djendi, Mohamed
    Zoulikha, Meriem
    COMPUTERS & ELECTRICAL ENGINEERING, 2014, 40 (07) : 2072 - 2088