Stereophonic Music Separation Based on Non-negative Tensor Factorization with Cepstrum Regularization

Cited by: 0

Authors:

Seki, Shogo [1]

Toda, Tomoki [2]

Takeda, Kazuya [1]

Affiliations:

[1] Nagoya Univ, Grad Sch Informat Sci, Chikusa Ku, Furo Cho, Nagoya, Aichi 4648601, Japan

[2] Nagoya Univ, Ctr Informat Technol, Chikusa Ku, Furo Cho, Nagoya, Aichi 4648601, Japan

Source:

2017 25TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO) | 2017

Keywords:

AUDIO SOURCE SEPARATION; MATRIX FACTORIZATION; MIXTURES

DOI:

Not available

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809

Abstract:

This paper presents a novel approach to stereophonic music separation based on Non-negative Tensor Factorization (NTF). Stereophonic music can be roughly divided into two types, recorded music and synthesized music; this paper focuses on the latter. Synthesized music signals are often generated as linear combinations of many individual source signals, each weighted by a mixing gain (i.e., a time-invariant amplitude scaling) for each channel. Separating synthesized stereophonic music is therefore an underdetermined source separation problem in which phase components are not helpful. NTF is an effective technique for this problem: it decomposes the amplitude spectrograms of the stereo music signal into basis vectors and activations of the individual music source signals, together with their corresponding mixing gains. However, sufficient separation performance is inherently difficult to obtain because the acoustic cues available for separation are limited. To address this issue, we propose a cepstrum regularization method for NTF-based stereo channel separation. The proposed method makes the separated music source signals follow Gaussian mixture models of the corresponding individual music sources, which are trained in advance on available samples of those sources. An experimental evaluation using real music signals is conducted to investigate the effectiveness of the proposed method in both supervised and unsupervised separation frameworks. The experimental results demonstrate that the proposed method yields significant improvements in separation performance in both frameworks.
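The decomposition described in the abstract (stereo amplitude spectrograms factored into mixing gains, basis vectors, and activations) can be illustrated with a minimal sketch. This is not the authors' implementation: the Euclidean cost, the multiplicative update rules, the function name `ntf`, and all variable names and dimensions are assumptions chosen for illustration, and the proposed cepstrum regularization with Gaussian mixture models is not included.

```python
import numpy as np

def ntf(X, n_components, n_iter=100, eps=1e-12, seed=0):
    """Plain NTF sketch: X[c, f, t] ~= sum_k Q[c, k] * W[f, k] * H[k, t].

    X : non-negative array (channels, frequency bins, time frames)
    Q : mixing gains per channel and component (assumed name)
    W : spectral basis vectors (assumed name)
    H : time activations (assumed name)
    Uses multiplicative updates for the Euclidean cost (an assumption;
    the abstract does not state which divergence is used).
    """
    rng = np.random.default_rng(seed)
    C, F, T = X.shape
    Q = rng.random((C, n_components)) + eps
    W = rng.random((F, n_components)) + eps
    H = rng.random((n_components, T)) + eps
    model = lambda: np.einsum('ck,fk,kt->cft', Q, W, H)
    losses = [0.5 * np.sum((X - model()) ** 2)]
    for _ in range(n_iter):
        # Each factor is updated with the model recomputed from the others.
        Xhat = model()
        Q *= np.einsum('cft,fk,kt->ck', X, W, H) / (
            np.einsum('cft,fk,kt->ck', Xhat, W, H) + eps)
        Xhat = model()
        W *= np.einsum('cft,ck,kt->fk', X, Q, H) / (
            np.einsum('cft,ck,kt->fk', Xhat, Q, H) + eps)
        Xhat = model()
        H *= np.einsum('cft,ck,fk->kt', X, Q, W) / (
            np.einsum('cft,ck,fk->kt', Xhat, Q, W) + eps)
        losses.append(0.5 * np.sum((X - model()) ** 2))
    return Q, W, H, losses

# Synthetic stand-in for a stereo amplitude spectrogram (not real music data).
rng = np.random.default_rng(1)
Q_true = rng.random((2, 3))
W_true = rng.random((16, 3))
H_true = rng.random((3, 20))
X_demo = np.einsum('ck,fk,kt->cft', Q_true, W_true, H_true)

Q, W, H, losses = ntf(X_demo, n_components=3)
print(f"initial cost {losses[0]:.4f}, final cost {losses[-1]:.4f}")
```

In this sketch each component k has a single gain per channel, which mirrors the time-invariant amplitude scaling mentioned in the abstract; grouping components into sources and applying any regularizer would be additional steps not shown here.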


Pages: 981-985

Number of pages: 5
