Hybrid Projective Nonnegative Matrix Factorization With Drum Dictionaries for Harmonic/Percussive Source Separation

Cited by: 2
Authors
Laroche, Clement [1, 2]
Kowalski, Matthieu [2 ]
Papadopoulos, Helene [2 ]
Richard, Gael [1 ]
Affiliations
[1] Univ Paris Saclay, Telecom ParisTech, LTCI, F-75013 Paris, France
[2] Univ Paris Sud, Cent Supelec, CNRS, UMR 8506, Lab Signaux & Syst, F-91192 Gif Sur Yvette, France
Keywords
Nonnegative matrix factorization; projective nonnegative matrix factorization; audio source separation; harmonic/percussive decomposition; POLYPHONIC MUSIC; MELODY EXTRACTION; SPEECH SIGNALS; TRANSCRIPTION; DECOMPOSITION; ALGORITHMS;
DOI
10.1109/TASLP.2018.2830116
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
One of the most general models of music signals considers that such signals can be represented as a sum of two distinct components: a tonal part that is sparse in frequency and temporally stable, and a transient (or percussive) part that is composed of short-term broadband sounds. In this paper, we propose a novel hybrid method built upon nonnegative matrix factorization (NMF) that decomposes the time-frequency representation of an audio signal into these two components. The tonal part is estimated by a sparse and orthogonal nonnegative decomposition, and the transient part is estimated by a straightforward NMF decomposition constrained by a pre-learned dictionary of smooth spectra. The optimization problem at the heart of our method remains simple, with very few hyperparameters, and can be solved with simple multiplicative update rules. An extensive benchmark on a large and varied music database against four state-of-the-art harmonic/percussive source separation algorithms demonstrates the merit of the proposed approach.
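The decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Euclidean cost, a model of the form V ≈ W Wᵀ V + D H (a projective term for the tonal part plus a fixed dictionary D for the transient part), heuristic multiplicative updates obtained by splitting the gradient into positive and negative parts, and a damping exponent chosen for numerical stability. The function name `hybrid_pnmf_sketch` and all of its parameters are hypothetical.

```python
import numpy as np

def hybrid_pnmf_sketch(V, D, n_harmonic=3, n_iter=50, damping=1/3,
                       eps=1e-12, seed=0):
    """Illustrative hybrid decomposition V ~ W W^T V + D H (assumed model).

    V : (F, T) nonnegative magnitude spectrogram.
    D : (F, Kp) fixed, pre-learned nonnegative drum dictionary (not updated).
    Returns W, H and the tonal and transient estimates.
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, n_harmonic)) + eps    # tonal basis, learned
    H = rng.random((D.shape[1], T)) + eps    # drum activations, learned
    VVt = V @ V.T                            # constant across iterations

    for _ in range(n_iter):
        # Update the drum activations; the dictionary D stays fixed.
        V_hat = W @ (W.T @ V) + D @ H
        H *= (D.T @ V) / (D.T @ V_hat + eps)

        # Update the projective (tonal) basis. The exponent damps the step;
        # it is an assumption of this sketch, not a rule from the paper.
        V_hat = W @ (W.T @ V) + D @ H
        num = 2.0 * (VVt @ W)
        den = V_hat @ (V.T @ W) + V @ (V_hat.T @ W) + eps
        W *= (num / den) ** damping

    harmonic = W @ (W.T @ V)   # tonal estimate
    percussive = D @ H         # transient estimate
    return W, H, harmonic, percussive
```

In this sketch the tonal activations are not a free variable: they are the projection Wᵀ V, so only W and H are learned, which is consistent with the abstract's remark that the problem has very few hyperparameters. A practical separation system would typically convert the two estimates into time-frequency masks before resynthesis; that step is omitted here.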
Pages: 1499 - 1511
Page count: 13
Related papers
50 in total
  • [41] β-Divergence Two-Dimensional Sparse Nonnegative Matrix Factorization for Audio Source Separation
    Darsono, A. M.
    Haron, N. Z.
    Jaafar, A. S.
    Ahmad, M. I.
    2013 IEEE CONFERENCE ON WIRELESS SENSOR (ICWISE), 2013, : 119 - 123
  • [42] Minimum-Volume Multichannel Nonnegative Matrix Factorization for Blind Audio Source Separation
    Wang, Jianyu
    Guan, Shanzheng
    Liu, Shupei
    Zhang, Xiao-Lei
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 (29) : 3089 - 3103
  • [43] Discriminative Nonnegative Matrix Factorization Using Cross-Reconstruction Error for Source Separation
    Kwon, Kisoo
    Shin, Jong Won
    Kim, Hyung Yong
    Kim, Nam Soo
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1513 - 1516
  • [44] Ray-Space-Based Multichannel Nonnegative Matrix Factorization for Audio Source Separation
    Pezzoli, Mirco
    Carabias-Orti, Julio Jose
    Cobos, Maximo
    Antonacci, Fabio
    Sarti, Augusto
    IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 369 - 373
  • [45] Supervised Audio Source Separation Based on Nonnegative Matrix Factorization with Cosine Similarity Penalty
    Iwase, Yuta
    Kitamura, Daichi
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2022, E105A (06) : 906 - 913
  • [46] FLOW-BASED FAST MULTICHANNEL NONNEGATIVE MATRIX FACTORIZATION FOR BLIND SOURCE SEPARATION
    Nugraha, Aditya Arie
    Sekiguchi, Kouhei
    Fontaine, Mathieu
    Bando, Yoshiaki
    Yoshii, Kazuyoshi
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 501 - 505
  • [47] Determined Blind Source Separation Unifying Independent Vector Analysis and Nonnegative Matrix Factorization
    Kitamura, Daichi
    Ono, Nobutaka
    Sawada, Hiroshi
    Kameoka, Hirokazu
    Saruwatari, Hiroshi
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (09) : 1626 - 1641
  • [48] AUTOREGRESSIVE FAST MULTICHANNEL NONNEGATIVE MATRIX FACTORIZATION FOR JOINT BLIND SOURCE SEPARATION AND DEREVERBERATION
    Sekiguchi, Kouhei
    Bando, Yoshiaki
    Nugraha, Aditya Arie
    Fontaine, Mathieu
    Yoshii, Kazuyoshi
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 511 - 515
  • [49] Online Blind Source Separation Using Incremental Nonnegative Matrix Factorization with Volume Constraint
    Zhou, Guoxu
    Yang, Zuyuan
    Xie, Shengli
    Yang, Jun-Mei
    IEEE TRANSACTIONS ON NEURAL NETWORKS, 2011, 22 (04) : 550 - 560
  • [50] Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria
    Virtanen, Tuomas
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (03) : 1066 - 1074