Multichannel Singing Voice Separation by Deep Neural Network Informed DOA Constrained CMNMF

Cited by: 0
Authors
Munoz-Montoro, Antonio J. [1]
Politis, Archontis [2]
Drossos, Konstantinos [2]
Carabias-Orti, Julio J. [1]
Affiliations
[1] Univ Jaen, Telecommun Engn Dept, Jaen, Spain
[2] Tampere Univ, Audio Res Grp, Tampere, Finland
Funding
European Research Council
Keywords
Multichannel Source Separation; Singing Voice; Deep Learning; CMNMF; Spatial Audio; Spatial Covariance Model; Audio Source Separation; Nonnegative Matrix
DOI
Not available
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
This work addresses multichannel source separation by combining two approaches: multichannel spectral factorization and recent monophonic deep learning (DL) based spectrum inference. The individual source spectra in each channel are estimated with a Masker-Denoiser twin network, which can model long-term temporal patterns of a musical piece. The resulting monophonic source spectrograms are then used within a spatial covariance mixing model based on complex-valued multichannel non-negative matrix factorization (CMNMF), which predicts the spatial characteristics of each source. The proposed framework is evaluated on singing voice separation with a large multichannel dataset. Experimental results show that the joint DL+CMNMF method outperforms both the monophonic DL-based separation alone and the multichannel CMNMF baseline.
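As a rough illustration of what a spatial covariance mixing model does with per-source spectrograms, the sketch below applies a generic multichannel Wiener filter in NumPy. It is not the paper's DOA-constrained CMNMF algorithm: the function name `multichannel_wiener_filter`, the array shapes, and the random placeholder data (standing in for network-estimated spectrograms and learned spatial covariances) are all assumptions made for this example.

```python
import numpy as np

def multichannel_wiener_filter(x, v, H, eps=1e-10):
    """Generic spatial-covariance-model separation (illustrative only).

    x: (F, T, M) complex mixture STFT for M channels
    v: (S, F, T) nonnegative power spectrograms, one per source
    H: (S, F, M, M) Hermitian positive semidefinite spatial covariances
    Returns (S, F, T, M) estimated multichannel source images.
    """
    M = x.shape[-1]
    # Per-source covariance: R[s, f, t] = v[s, f, t] * H[s, f]
    R = v[..., None, None] * H[:, :, None, :, :]      # (S, F, T, M, M)
    # Mixture covariance is the sum over sources (eps keeps it invertible)
    R_mix = R.sum(axis=0) + eps * np.eye(M)           # (F, T, M, M)
    # Wiener gain per source: W_s = R_s @ inv(R_mix)
    W = R @ np.linalg.inv(R_mix)                      # (S, F, T, M, M)
    return np.einsum("sftmn,ftn->sftm", W, x)

# Placeholder data: 2 sources, 4 frequency bins, 5 frames, 2 channels.
rng = np.random.default_rng(0)
S, F, T, M = 2, 4, 5, 2
x = rng.standard_normal((F, T, M)) + 1j * rng.standard_normal((F, T, M))
v = rng.uniform(0.1, 1.0, size=(S, F, T))             # stand-in for DL output
A = rng.standard_normal((S, F, M, M)) + 1j * rng.standard_normal((S, F, M, M))
H = A @ np.conj(np.swapaxes(A, -1, -2)) + 0.1 * np.eye(M)  # Hermitian PSD

est = multichannel_wiener_filter(x, v, H)
```

Because the Wiener gains of all sources sum to (approximately) the identity, the estimated source images in this sketch add back up to the mixture. In the setting the abstract describes, `v` would come from the monophonic network and the spatial part would be fitted by CMNMF rather than drawn at random.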
Pages: 6
Related papers
50 items in total
  • [21] Unsupervised Deep Unfolded Representation Learning for Singing Voice Separation
    Yuan, Weitao
    Wang, Shengbei
    Wang, Jianming
    Unoki, Masashi
    Wang, Wenwu
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 3206 - 3220
  • [22] Phoneme Level Lyrics Alignment and Text-Informed Singing Voice Separation
    Schulze-Forster, Kilian
    Doire, Clement S. J.
    Richard, Gael
    Badeau, Roland
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 (29) : 2382 - 2395
  • [23] Neural Vocoder Feature Estimation for Dry Singing Voice Separation
    Im, Jaekwon
    Choi, Soonbeom
    Yong, Sangeon
    Nam, Juhan
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 809 - 814
  • [24] Multichannel Music Separation with Deep Neural Networks
    Nugraha, Aditya Arie
    Liutkus, Antoine
    Vincent, Emmanuel
    2016 24TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2016, : 1748 - 1752
  • [25] SVSGAN: Singing Voice Separation via Generative Adversarial Network
    Fan, Zhe-Cheng
    Lai, Yen-Lin
    Jang, Jyh-Shing R.
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 726 - 730
  • [26] Multichannel Audio Source Separation With Deep Neural Networks
    Nugraha, Aditya Arie
    Liutkus, Antoine
    Vincent, Emmanuel
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (09) : 1652 - 1664
  • [27] Monaural singing voice separation based on high-resolution network
    Zhang Y.
    Niu Z.
    Niu B.
    Chang Y.
    Niu, Zhixian (niuniurose63@163.com), 1600, Beijing University of Aeronautics and Astronautics (BUAA) (46): : 1555 - 1563
  • [28] Improving Singing Voice Separation Using Curriculum Learning on Recurrent Neural Networks
    Kang, Seungtae
    Park, Jeong-Sik
    Jang, Gil-Jin
    APPLIED SCIENCES-BASEL, 2020, 10 (07):
  • [29] Pathological Voice Recognition by Deep Neural Network
    Zhang, Xiaojun
    Tao, Zhi
    Zhao, Heming
    Xu, Tianqi
    2017 4TH INTERNATIONAL CONFERENCE ON SYSTEMS AND INFORMATICS (ICSAI), 2017, : 464 - 468
  • [30] Singing Voice Synthesis Using Deep Autoregressive Neural Networks for Acoustic Modeling
    Yi, Yuan-Hao
    Ai, Yang
    Ling, Zhen-Hua
    Dai, Li-Rong
    INTERSPEECH 2019, 2019, : 2593 - 2597