Single channel speech music separation using nonnegative matrix factorization with sliding windows and spectral masks

Cited by: 0

Authors:

Grais, Emad M. [1]

Erdogan, Hakan [1]

Affiliation:

[1] Sabanci Univ, Fac Engn & Nat Sci, TR-34956 Istanbul, Turkey

Source:

12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5 | 2011

Keywords:

Single channel source separation; source separation; semi-blind source separation; speech music separation; speech processing; nonnegative matrix factorization; Wiener filter

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

A single-channel speech-music separation algorithm based on nonnegative matrix factorization (NMF) with sliding windows and spectral masks is proposed in this work. We train a set of basis vectors for each source signal using NMF in the magnitude spectral domain. Rather than forming each column of the matrices to be decomposed by NMF from a single spectral frame, we build each column from multiple spectral frames stacked together. After observing the mixed signal, NMF is used to decompose its magnitude spectra into a weighted linear combination of the trained basis vectors of both sources. An initial spectrogram estimate for each source is found, and a spectral mask is built from these initial estimates. This mask is used to weight the mixed signal spectrogram to find the contribution of each source signal to the mixed signal. The method is shown to perform better than the conventional NMF approach.
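The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the Euclidean-cost multiplicative updates, the rank, the window length `L`, the averaging used to undo the frame stacking, and the mask exponent `p` (with `p=2` giving a Wiener-like mask). The helper names `stack_frames`, `unstack_frames`, `nmf`, and `separate` are hypothetical.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero

def stack_frames(V, L):
    """Slide a window of L frames over V (F x T) and stack each window into one column."""
    T = V.shape[1]
    return np.vstack([V[:, i:T - L + 1 + i] for i in range(L)])  # (F*L, T-L+1)

def unstack_frames(S, F, L):
    """Average the overlapping stacked frames back into an F x T spectrogram (assumed scheme)."""
    n = S.shape[1]
    T = n + L - 1
    out = np.zeros((F, T))
    count = np.zeros(T)
    for i in range(L):
        out[:, i:i + n] += S[i * F:(i + 1) * F, :]
        count[i:i + n] += 1
    return out / count

def nmf(V, B=None, rank=8, n_iter=200, seed=0):
    """Multiplicative-update NMF with Euclidean cost (assumed). A given B is kept fixed."""
    rng = np.random.default_rng(seed)
    update_B = B is None
    if update_B:
        B = rng.random((V.shape[0], rank)) + EPS
    G = rng.random((B.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        G *= (B.T @ V) / (B.T @ B @ G + EPS)
        if update_B:
            B *= (V @ G.T) / (B @ G @ G.T + EPS)
    return B, G

def separate(V_mix, B_speech, B_music, L, p=2):
    """Decompose the mixture on the trained bases, then refine with a spectral mask."""
    F = V_mix.shape[0]
    B = np.hstack([B_speech, B_music])
    _, G = nmf(stack_frames(V_mix, L), B=B)  # only the gains are estimated
    k = B_speech.shape[1]
    S0 = unstack_frames(B_speech @ G[:k], F, L)  # initial speech estimate
    M0 = unstack_frames(B_music @ G[k:], F, L)   # initial music estimate
    mask = S0**p / (S0**p + M0**p + EPS)
    return mask * V_mix, (1.0 - mask) * V_mix
```

In use, the bases would be trained on stacked magnitude spectrograms of speech-only and music-only data, for example `B_speech, _ = nmf(stack_frames(V_speech_train, L))`, and `separate` would then be applied to the magnitude spectrogram of the mixture. Because the two masks sum to one, the two estimates add up to the mixture spectrogram.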


Pages: 1784-1787

Page count: 4
