HARMONIC-PERCUSSIVE SOURCE SEPARATION WITH DEEP NEURAL NETWORKS AND PHASE RECOVERY

Cited by: 0
Authors
Drossos, Konstantinos [1]
Magron, Paul [1]
Mimilakis, Stylianos Ioannis [2]
Virtanen, Tuomas [1]
Affiliations
[1] Tampere Univ Technol, Lab Signal Proc, Tampere, Finland
[2] Fraunhofer IDMT, Ilmenau, Germany
Funding
EU Horizon 2020; Academy of Finland; European Research Council;
Keywords
harmonic/percussive source separation; deep neural networks; MaD TwinNet; phase recovery; sinusoidal model;
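DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract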
Harmonic/percussive source separation (HPSS) consists in separating the pitched instruments from the percussive parts in a music mixture. In this paper, we apply the recently introduced Masker-Denoiser with twin networks (MaD TwinNet) system to this task. MaD TwinNet is a deep learning architecture that has reached state-of-the-art results in monaural singing voice separation. Here, we use it to estimate the magnitude spectrogram of the percussive source. We then retrieve the complex-valued short-time Fourier transform of the sources by means of a phase recovery algorithm, which minimizes the reconstruction error and enforces the phase of the harmonic part to follow a sinusoidal phase model. Experiments conducted on realistic music mixtures show that this novel separation system outperforms the previous state-of-the-art kernel additive model approach.
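The two-stage pipeline summarized in the abstract (percussive magnitude estimate, then phase recovery) can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names `sinusoidal_phase` and `separate` are hypothetical, the percussive magnitude is assumed to come from some external estimator, each bin's normalized frequency is assumed to be its bin centre, and the reconstruction error is removed with a single equal-split step rather than an iterative minimization.

```python
import numpy as np


def sinusoidal_phase(first_phase, n_frames, hop, n_fft):
    """Unwrap phase over time: phi[f, t] = phi[f, 0] + 2*pi*hop*nu[f]*t."""
    n_bins = first_phase.shape[0]
    # Assumption: the normalized frequency of bin f is its centre, f / n_fft.
    nu = np.arange(n_bins) / n_fft
    steps = 2.0 * np.pi * hop * nu[:, None] * np.arange(n_frames)[None, :]
    return first_phase[:, None] + steps


def separate(mix_stft, perc_mag, hop, n_fft):
    """Build complex harmonic/percussive STFTs from a percussive magnitude."""
    mix_mag = np.abs(mix_stft)
    # Keep the percussive magnitude below the mixture magnitude.
    perc_mag = np.minimum(perc_mag, mix_mag)
    # Assumption: the harmonic magnitude is the non-negative residual.
    harm_mag = mix_mag - perc_mag

    # Harmonic part: phase follows the sinusoidal model from the first frame.
    harm_phase = sinusoidal_phase(
        np.angle(mix_stft[:, 0]), mix_stft.shape[1], hop, n_fft
    )
    harm = harm_mag * np.exp(1j * harm_phase)
    # Percussive part: reuse the mixture phase as a simple initial guess.
    perc = perc_mag * np.exp(1j * np.angle(mix_stft))

    # Split the reconstruction error equally so the estimates sum to the mix.
    err = mix_stft - (harm + perc)
    return harm + err / 2.0, perc + err / 2.0


# Toy usage with random data standing in for a real STFT (5 bins, 4 frames).
rng = np.random.default_rng(0)
mix = rng.standard_normal((5, 4)) + 1j * rng.standard_normal((5, 4))
harm, perc = separate(mix, 0.5 * np.abs(mix), hop=2, n_fft=8)
```

After the error-distribution step the two estimates add up to the mixture exactly, but their magnitudes no longer match the targets exactly; an iterative scheme would alternate between these two constraints.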
Pages: 421 - 425
Number of pages: 5
Related papers
50 items in total
  • [31] Hybrid neural networks for ISFET source separation
    Bermejo, S
    Bedoya, G
    Cabestany, J
    SMART SENSORS, ACTUATORS, AND MEMS, PTS 1 AND 2, 2003, 5116 : 109 - 119
  • [32] FULLY COMPLEX DEEP NEURAL NETWORK FOR PHASE-INCORPORATING MONAURAL SOURCE SEPARATION
    Lee, Yuan-Shan
    Wang, Chien-Yao
    Wang, Shu-Fan
    Wang, Jia-Ching
    Wu, Chung-Hsien
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 281 - 285
  • [33] HARMONIC SOURCE MONITORING AND IDENTIFICATION USING NEURAL NETWORKS
    HARTANA, RK
    RICHARDS, GG
    IEEE TRANSACTIONS ON POWER SYSTEMS, 1990, 5 (04) : 1098 - 1104
  • [34] Multichannel Music Separation with Deep Neural Networks
    Nugraha, Aditya Arie
    Liutkus, Antoine
    Vincent, Emmanuel
    2016 24TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2016, : 1748 - 1752
  • [35] Phase harmonic correlations and convolutional neural networks
    Mallat, Stephane
    Zhang, Sixin
    Rochette, Gaspar
    INFORMATION AND INFERENCE-A JOURNAL OF THE IMA, 2020, 9 (03) : 721 - 747
  • [36] JOINT TRAINING OF DEEP NEURAL NETWORKS FOR MULTI-CHANNEL DEREVERBERATION AND SPEECH SOURCE SEPARATION
    Togami, Masahito
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 3032 - 3036
  • [37] Combining Mask Estimates for Single Channel Audio Source Separation using Deep Neural Networks
    Grais, Emad M.
    Roma, Gerard
    Simpson, Andrew J. R.
    Plumbley, Mark D.
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3339 - 3343
  • [38] Universal Source Coding of Deep Neural Networks
    Basu, Sourya
    Varshney, Lav R.
    2017 DATA COMPRESSION CONFERENCE (DCC), 2017, : 310 - 319
  • [39] Using Diffraction Deep Neural Networks for Indirect Phase Recovery Based on Zernike Polynomials
    Yuan, Fang
    Sun, Yang
    Han, Yuting
    Chu, Hairong
    Ma, Tianxiang
    Shen, Honghai
    SENSORS, 2024, 24 (02)
  • [40] PHASE RECOVERY WITH BREGMAN DIVERGENCES FOR AUDIO SOURCE SEPARATION
    Magron, Paul
    Vial, Pierre-Hugo
    Oberlin, Thomas
    Fevotte, Cedric
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 516 - 520