Blind Speech Separation and Enhancement With GCC-NMF

Cited by: 40

Authors:

Wood, Sean U. N. [1]

Rouat, Jean [1]

Dupont, Stephane [2]

Pironkov, Gueorgui [2]

Affiliations:

[1] Univ Sherbrooke, Dept Elect & Comp Engn, NECOTIS, Sherbrooke, PQ J1K 2R1, Canada

[2] Univ Mons, Dept Theory Circuits & Signal Proc, B-7000 Mons, Belgium

Source:

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2017 / Vol. 25 / Issue 04

Funding:

Natural Sciences and Engineering Research Council of Canada;

Keywords:

Blind speech separation; CASA; cocktail party problem; GCC; interaural time difference; NMF; PHAT; NONNEGATIVE MATRIX FACTORIZATION; AUDIO SOURCE SEPARATION; INFORMATION; MODELS;

DOI:

10.1109/TASLP.2017.2656805

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

We present a blind source separation algorithm named GCC-NMF that combines unsupervised dictionary learning via non-negative matrix factorization (NMF) with spatial localization via the generalized cross correlation (GCC) method. Dictionary learning is performed on the mixture signal, with separation subsequently achieved by grouping dictionary atoms, at each point in time, according to their spatial origins. The resulting source separation algorithm is simple yet flexible, requiring no prior knowledge or information. Separation quality is evaluated for three tasks using stereo recordings from the publicly available SiSEC signal separation evaluation campaign: 3 and 4 concurrent speakers in reverberant environments, speech mixed with real-world background noise, and noisy recordings of a moving speaker. Performance is quantified using perceptually motivated and SNR-based measures with the PEASS and BSS Eval toolkits, respectively. We evaluate the effects of model parameters on separation quality, and compare our approach with other unsupervised and semi-supervised speech separation and enhancement approaches. We show that GCC-NMF is a flexible source separation algorithm, outperforming task-specific approaches in each of the three settings, including both blind as well as several informed approaches that require prior knowledge or information.


Pages: 745 - 755

Page count: 11

