Supervised Audio Source Separation Based on Nonnegative Matrix Factorization with Cosine Similarity Penalty

Cited by: 0
Authors
Iwase, Yuta [1 ]
Kitamura, Daichi [1 ]
Affiliations
[1] Kagawa Coll, Natl Inst Technol, Takamatsu, Kagawa 7618058, Japan
Keywords
audio source separation; nonnegative matrix factorization; orthogonality; cosine similarity; divergence
DOI
10.1587/transfun.2021EAP1149
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this study, we aim to improve the performance of audio source separation for monaural mixture signals. For monaural audio source separation, semisupervised nonnegative matrix factorization (SNMF) can achieve higher separation performance by employing small supervised signals. In particular, penalized SNMF (PSNMF) with orthogonality penalty is an effective method. PSNMF forces two basis matrices for target and nontarget sources to be orthogonal to each other and improves the separation accuracy. However, the conventional orthogonality penalty is based on an inner product and does not affect the estimation of the basis matrix properly because of the scale indeterminacy between the basis and activation matrices in NMF. To cope with this problem, a new PSNMF with cosine similarity between the basis matrices is proposed. The experimental comparison shows the efficacy of the proposed cosine similarity penalty in supervised audio source separation.
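The scale indeterminacy mentioned in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the names `F` (target basis), `H` (nontarget basis), and `U` (nontarget activations), the two penalty functions, and the random test data are assumptions made for illustration, and the exact form of the penalty used in the paper is not given in this record. The sketch only shows that shrinking a basis matrix while enlarging its activations leaves the model unchanged, reduces an inner-product penalty, and leaves a cosine-similarity penalty unaffected.

```python
import numpy as np

def inner_product_penalty(F, H):
    """Sum of inner products over all column pairs of F and H (illustrative form)."""
    return float(np.sum(F.T @ H))

def cosine_penalty(F, H, eps=1e-12):
    """Sum of cosine similarities over all column pairs of F and H (illustrative form)."""
    Fn = F / (np.linalg.norm(F, axis=0, keepdims=True) + eps)
    Hn = H / (np.linalg.norm(H, axis=0, keepdims=True) + eps)
    return float(np.sum(Fn.T @ Hn))

rng = np.random.default_rng(0)
F = rng.random((6, 3))   # hypothetical target basis (frequency bins x bases)
H = rng.random((6, 2))   # hypothetical nontarget basis
U = rng.random((2, 5))   # hypothetical nontarget activations (bases x frames)

# Rescale the basis down and the activations up by the same factor.
scale = 0.01
H_scaled = H * scale
U_scaled = U / scale

# The nontarget part of the model is identical before and after rescaling.
Y_hat = H @ U
Y_hat_rescaled = H_scaled @ U_scaled

ip_before = inner_product_penalty(F, H)
ip_after = inner_product_penalty(F, H_scaled)
cos_before = cosine_penalty(F, H)
cos_after = cosine_penalty(F, H_scaled)

print("model unchanged:", np.allclose(Y_hat, Y_hat_rescaled))
print("inner-product penalty:", ip_before, "->", ip_after)
print("cosine penalty:", cos_before, "->", cos_after)
```

Under these assumptions, an optimizer could lower the inner-product penalty simply by shrinking `H` without making the two bases any less similar in direction, whereas the cosine penalty responds only to the directions of the basis vectors.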
Pages: 906-913
Number of pages: 8
Related papers
50 in total
  • [1] Audio Source Separation Based on Nonnegative Matrix Factorization with Graph Harmonic Structure
    Ichita, Tomohiro
    Kyochi, Seisuke
    Imoto, Keisuke
    [J]. 2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 1148 - 1152
  • [2] DISCRIMINATIVE AND RECONSTRUCTIVE BASIS TRAINING FOR AUDIO SOURCE SEPARATION WITH SEMI-SUPERVISED NONNEGATIVE MATRIX FACTORIZATION
    Kitamura, Daichi
    Ono, Nobutaka
    Saruwatari, Hiroshi
    Takahashi, Yu
    Kondo, Kazunobu
    [J]. 2016 IEEE INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC), 2016,
  • [3] Robust Nonnegative Matrix Factorization Based on Cosine Similarity Induced Metric
    Chen, Wen-Sheng
    Chen, Haitao
    Pan, Binbin
    Chen, Bo
    [J]. INTELLIGENCE SCIENCE AND BIG DATA ENGINEERING: BIG DATA AND MACHINE LEARNING, PT II, 2019, 11936 : 278 - 288
  • [4] Multichannel Nonnegative Matrix Factorization in Convolutive Mixtures for Audio Source Separation
    Ozerov, Alexey
    Fevotte, Cedric
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (03) : 550 - 563
  • [5] BAYESIAN MULTICHANNEL NONNEGATIVE MATRIX FACTORIZATION FOR AUDIO SOURCE SEPARATION AND LOCALIZATION
    Itakura, Kousuke
    Bando, Yoshiaki
    Nakamura, Eita
    Itoyama, Katsutoshi
    Yoshii, Kazuyoshi
    Kawahara, Tatsuya
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 551 - 555
  • [6] Ray-Space-Based Multichannel Nonnegative Matrix Factorization for Audio Source Separation
    Pezzoli, Mirco
    Carabias-Orti, Julio Jose
    Cobos, Maximo
    Antonacci, Fabio
    Sarti, Augusto
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 369 - 373
  • [7] Temporal annotation-based audio source separation using weighted nonnegative matrix factorization
    Duong, Ngoc Q. K.
    Ozerov, Alexey
    Chevallier, Louis
    [J]. 2014 IEEE Fourth International Conference on Consumer Electronics Berlin (ICCE-Berlin), 2014, : 220 - 224
  • [8] Beamspace-Domain Multichannel Nonnegative Matrix Factorization for Audio Source Separation
    Lee, Seokjin
    Park, Sang Ha
    Sung, Koeng-Mo
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2012, 19 (01) : 43 - 46
  • [9] NONNEGATIVE TENSOR FACTORIZATION FOR SOURCE SEPARATION OF LOOPS IN AUDIO
    Smith, Jordan B. L.
    Goto, Masataka
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 171 - 175
  • [10] A STRUCTURED NONNEGATIVE MATRIX FACTORIZATION FOR SOURCE SEPARATION
    Laroche, Clement
    Kowalski, Matthieu
    Papadopoulos, Helene
    Richard, Gael
    [J]. 2015 23RD EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2015, : 2033 - 2037