HARMONIC-PERCUSSIVE SOURCE SEPARATION WITH DEEP NEURAL NETWORKS AND PHASE RECOVERY

Cited by: 0
Authors
Drossos, Konstantinos [1]
Magron, Paul [1]
Mimilakis, Stylianos Ioannis [2]
Virtanen, Tuomas [1]
Affiliations
[1] Tampere Univ Technol, Lab Signal Proc, Tampere, Finland
[2] Fraunhofer IDMT, Ilmenau, Germany
Funding
EU Horizon 2020; Academy of Finland; European Research Council
Keywords
harmonic/percussive source separation; deep neural networks; MaD TwinNet; phase recovery; sinusoidal model;
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Harmonic/percussive source separation (HPSS) consists of separating the pitched instruments from the percussive parts in a music mixture. In this paper, we propose to apply the recently introduced Masker-Denoiser with twin networks (MaD TwinNet) system to this task. MaD TwinNet is a deep learning architecture that has reached state-of-the-art results in monaural singing voice separation. Herein, we use it to estimate the magnitude spectrogram of the percussive source. Then, we retrieve the complex-valued short-time Fourier transform of the sources by means of a phase recovery algorithm, which minimizes the reconstruction error and constrains the phase of the harmonic part to follow a sinusoidal phase model. Experiments conducted on realistic music mixtures show that this novel separation system outperforms the previous state-of-the-art kernel additive model approach.
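The abstract does not spell out the sinusoidal phase model, so the following is only a minimal sketch of the general idea, not the authors' algorithm (which additionally minimizes the reconstruction error). It assumes the common formulation in which the phase of a stationary sinusoid in one frequency channel advances between consecutive frames by 2π · hop · ν, where ν is the normalized frequency in cycles per sample. The function names (`propagate_phase`, `stft_bin`), the rectangular window, and all parameter values are hypothetical choices made for illustration.

```python
import cmath
import math


def propagate_phase(initial_phase, freqs, hop):
    """Unwrap phase over time in one channel (illustrative sinusoidal model).

    freqs[t] is the assumed normalized frequency (cycles/sample) in frame t;
    the phase in frame t is the phase in frame t-1 plus 2*pi*hop*freqs[t].
    """
    phases = [initial_phase]
    for nu in freqs[1:]:
        phases.append(phases[-1] + 2.0 * math.pi * hop * nu)
    return phases


def stft_bin(signal, start, win_len, k):
    """DFT bin k of a rectangular-windowed frame beginning at sample `start`."""
    return sum(
        signal[start + n] * cmath.exp(-2j * math.pi * k * n / win_len)
        for n in range(win_len)
    )


# Hypothetical toy setup: one complex sinusoid, 5 frames.
nu, win_len, hop, k, n_frames = 0.05, 64, 16, 3, 5
signal = [
    cmath.exp(1j * (2.0 * math.pi * nu * n + 0.3))
    for n in range(win_len + hop * (n_frames - 1))
]

# Measure the phase in the first frame only, then predict the rest.
frames = [stft_bin(signal, t * hop, win_len, k) for t in range(n_frames)]
predicted = propagate_phase(cmath.phase(frames[0]), [nu] * n_frames, hop)

# Compare on the unit circle so that 2*pi wrapping does not matter.
max_err = max(
    abs(cmath.exp(1j * p) - x / abs(x)) for p, x in zip(predicted, frames)
)
```

In this toy case the predicted and measured phases agree up to rounding error, because the signal is an exactly stationary sinusoid. Real harmonic sources only approximately satisfy the model, which is presumably why the paper combines it with a reconstruction-error criterion.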
Pages: 421-425
Number of pages: 5
Related papers
50 in total
  • [21] HARMONIC AND PERCUSSIVE SOUND SEPARATION BASED ON MIXED PARTIAL DERIVATIVE OF PHASE SPECTROGRAM
    Akaishi, Natsuki
    Yatabe, Kohei
    Oikawa, Yasuhiro
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 301 - 305
  • [22] Referenceless Performance Evaluation of Audio Source Separation using Deep Neural Networks
    Grais, Emad M.
    Wierstorf, Hagen
    Ward, Dominic
    Mason, Russell
    Plumbley, Mark D.
    2019 27TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2019,
  • [23] Single Channel Speech Source Separation Using Hierarchical Deep Neural Networks
    Noorani, Seyed Majid
    Seyedin, Sanaz
    2020 28TH IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2020, : 466 - 470
  • [24] LOW-LATENCY SOUND SOURCE SEPARATION USING DEEP NEURAL NETWORKS
    Naithani, Gaurav
    Parascandolo, Giambattista
    Barker, Tom
    Pontoppidan, Niels Henrik
    Virtanen, Tuomas
    2016 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP), 2016, : 272 - 276
  • [25] Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation
    Huang, Po-Sen
    Kim, Minje
    Hasegawa-Johnson, Mark
    Smaragdis, Paris
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (12) : 2136 - 2147
  • [26] LINEAR MULTICHANNEL BLIND SOURCE SEPARATION BASED ON TIME-FREQUENCY MASK OBTAINED BY HARMONIC/PERCUSSIVE SOUND SEPARATION
    Oyabu, Soichiro
    Kitamura, Daichi
    Yatabe, Kohei
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 201 - 205
  • [27] Group Delay based Music Source Separation using Deep Recurrent Neural Networks
    Sebastian, Jilt
    Murthy, Hema A.
    2016 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS (SPCOM), 2016,
  • [28] Discriminative Enhancement for Single Channel Audio Source Separation Using Deep Neural Networks
    Grais, Emad M.
    Roma, Gerard
    Simpson, Andrew J. R.
    Plumbley, Mark D.
    LATENT VARIABLE ANALYSIS AND SIGNAL SEPARATION (LVA/ICA 2017), 2017, 10169 : 236 - 246
  • [29] Phase recovery and holographic image reconstruction using deep learning in neural networks
    Rivenson, Yair
    Zhang, Yibo
    Gunaydin, Harun
    Teng, Da
    Ozcan, Aydogan
    LIGHT-SCIENCE & APPLICATIONS, 2018, 7 : 17141 - 17141