Discriminative Nonnegative Dictionary Learning using Cross-Coherence Penalties for Single Channel Source Separation

Cited by: 0
Authors
Grais, Emad M. [1 ]
Erdogan, Hakan [1 ]
Affiliations
[1] Sabanci Univ, Fac Engn & Nat Sci, TR-34956 Istanbul, Turkey
Keywords
Single channel source separation; nonnegative matrix factorization; discriminative training; dictionary learning; MATRIX FACTORIZATION; ALGORITHMS;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In this work, we introduce a new discriminative training method for nonnegative dictionary learning that can be used in single channel source separation (SCSS) applications. In SCSS, nonnegative matrix factorization (NMF) is used to learn a dictionary (a set of basis vectors) for each source in the magnitude spectrum domain. The trained dictionaries are then used to decompose the mixed signal and find an estimate of each source. Learning discriminative dictionaries for the source signals can improve the separation performance. To make the dictionaries discriminative, we aim to prevent the basis set of one source's dictionary from representing the signals of the other sources. We propose to minimize the cross-coherence between the dictionaries of all sources in the mixed signal, incorporating a simplified cross-coherence penalty into a regularized NMF cost function so that the learned dictionaries are simultaneously discriminative and reconstructive. The new regularized NMF update rules used to discriminatively train the dictionaries are introduced in this work. Experimental results show that discriminative training gives better separation results than conventional NMF.
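To make the procedure described in the abstract concrete, the sketch below illustrates NMF dictionary training with a simplified cross-coherence penalty and the subsequent decomposition of a mixture spectrogram over the concatenated dictionaries. This is a minimal illustration only: the Euclidean NMF cost, the penalty weight `lam`, and all function and variable names are assumptions for this example, not the paper's exact regularized update rules.

```python
# Illustrative sketch (assumed Euclidean NMF cost, not the paper's exact updates):
# train one source's dictionary while penalizing its coherence with the other
# source's dictionary, then separate a mixture using both dictionaries.
import numpy as np

def nmf_train(V, n_bases, B_other=None, lam=0.1, n_iter=200, eps=1e-12):
    """Learn a nonnegative dictionary B and activations G for magnitude
    spectra V (freq x frames). If B_other is given, a simplified
    cross-coherence penalty lam * ||B_other^T B||_F^2 discourages B from
    representing the other source."""
    F, T = V.shape
    rng = np.random.default_rng(0)
    B = rng.random((F, n_bases)) + eps
    G = rng.random((n_bases, T)) + eps
    for _ in range(n_iter):
        # Standard multiplicative updates; the gradient of the penalty
        # (B_other B_other^T B) is added to the denominator of the B update.
        G *= (B.T @ V) / (B.T @ B @ G + eps)
        den = B @ (G @ G.T)
        if B_other is not None:
            den = den + lam * (B_other @ (B_other.T @ B))
        B *= (V @ G.T) / (den + eps)
        B /= np.maximum(B.sum(axis=0, keepdims=True), eps)  # normalize bases
    return B, G

def separate(V_mix, B1, B2, n_iter=200, eps=1e-12):
    """Decompose the mixture over the fixed, concatenated dictionaries and
    reconstruct each source with a Wiener-like mask."""
    B = np.hstack([B1, B2])
    G = np.random.default_rng(1).random((B.shape[1], V_mix.shape[1])) + eps
    for _ in range(n_iter):
        G *= (B.T @ V_mix) / (B.T @ B @ G + eps)
    V1 = B1 @ G[: B1.shape[1]]
    V2 = B2 @ G[B1.shape[1]:]
    mask1 = V1 / (V1 + V2 + eps)
    return mask1 * V_mix, (1.0 - mask1) * V_mix
```

In this sketch the penalty only enters the dictionary update, so activations are still estimated with the usual multiplicative rule at separation time; the discrimination comes entirely from the trained bases.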
Pages: 808-812
Page count: 5
Related Papers
50 records in total
  • [21] Source separation using single channel ICA
    Davies, M. E.
    James, C. J.
    SIGNAL PROCESSING, 2007, 87 (08) : 1819 - 1832
  • [22] Dual transform based joint learning single channel speech separation using generative joint dictionary learning
    Hossain, Md Imran
    Al Mahmud, Tarek Hasan
    Islam, Md Shohidul
    Hossen, Md Bipul
    Khan, Rashid
    Ye, Zhongfu
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (20) : 29321 - 29346
  • [24] Single-channel blind source separation based on joint dictionary with common sub-dictionary
    Sun, L.
    Zhao, C.
    Su, M.
    Wang, F.
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2018, 21 (1) : 19 - 27
  • [25] Nonnegative matrix factorization 2D with the flexible β-Divergence for Single Channel Source Separation
    Yu, Kaiwen
    Woo, W. L.
    Dlay, S. S.
    2015 IEEE INTERNATIONAL WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS 2015), 2015,
  • [26] Machine Learning Source Separation Using Maximum A Posteriori Nonnegative Matrix Factorization
    Gao, Bin
    Woo, W. L.
    Ling, Bingo W-K.
    IEEE TRANSACTIONS ON CYBERNETICS, 2014, 44 (07) : 1169 - 1179
  • [27] Nonnegative matrix factor 2-D deconvolution for blind single channel source separation
    Schmidt, MN
    Morup, M
    INDEPENDENT COMPONENT ANALYSIS AND BLIND SIGNAL SEPARATION, PROCEEDINGS, 2006, 3889 : 700 - 707
  • [28] Hidden Markov Models as Priors for Regularized Nonnegative Matrix Factorization in Single-Channel Source Separation
    Grais, Emad M.
    Erdogan, Hakan
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1534 - 1537
  • [29] Gaussian Mixture Gain Priors for Regularized Nonnegative Matrix Factorization in Single-Channel Source Separation
    Grais, Emad M.
    Erdogan, Hakan
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1518 - 1521
  • [30] Representation Learning for Single-Channel Source Separation and Bandwidth Extension
    Zoehrer, Matthias
    Peharz, Robert
    Pernkopf, Franz
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (12) : 2398 - 2409