Multi-Band Multi-Resolution Fully Convolutional Neural Networks for Singing Voice Separation

Cited by: 0

Authors:

Grais, Emad M. [1]

Zhao, Fei [1]

Plumbley, Mark D. [2]

Affiliations:

[1] Cardiff Metropolitan Univ, Ctr Speech & Language Therapy & Hearing Sci, Cardiff, Wales

[2] Univ Surrey, Ctr Vis Speech & Signal Proc, Guildford, Surrey, England

Source:

28TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2020) | 2021

Funding:

UK Engineering and Physical Sciences Research Council (EPSRC);

Keywords:

Deep learning; convolutional neural networks; singing voice separation; single channel audio source separation; feature extraction; MUSIC;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

Deep neural networks with convolutional layers usually process the entire spectrogram of an audio signal with the same time-frequency resolutions, number of filters, and dimensionality reduction scale. According to the constant-Q transform, good features can be extracted from audio signals if the low frequency bands are processed with high frequency resolution filters and the high frequency bands with high time resolution filters. In the spectrogram of a mixture of singing voices and music signals, there is usually more information about the voice in the low frequency bands than in the high frequency bands. These observations raise the need for processing each part of the spectrogram differently. In this paper, we propose a multi-band multi-resolution fully convolutional neural network (MBR-FCN) for singing voice separation. The MBR-FCN processes the frequency bands that have more information about the target signals with more filters and a smaller dimensionality reduction scale than the bands with less information. Furthermore, the MBR-FCN processes the low frequency bands with high frequency resolution filters and the high frequency bands with high time resolution filters. Our experimental results show that the proposed MBR-FCN with very few parameters achieves better singing voice separation performance than other deep neural networks.
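
The per-band idea in the abstract can be illustrated with a small, dependency-free Python sketch. It is not the authors' MBR-FCN: the band edges, kernel sizes, filter counts, and pooling scales are hypothetical, and a moving average stands in for a learned convolutional filter. It also assumes, by analogy with the constant-Q transform, that a "high frequency resolution" filter is narrow along frequency and long along time, and that a "high time resolution" filter is the reverse. The sketch only shows how each band could receive its own filter shape, filter count, and reduction scale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BandConfig:
    """Hypothetical per-band settings; values are illustrative, not from the paper."""
    lo_bin: int        # first frequency bin (inclusive)
    hi_bin: int        # last frequency bin (exclusive)
    kernel_freq: int   # filter extent along frequency
    kernel_time: int   # filter extent along time
    num_filters: int   # more filters where the target carries more information
    pool: int          # dimensionality reduction scale along frequency


def split_bands(spec, configs):
    """Slice a spectrogram (list of frequency rows, each a list over time) into bands."""
    return [spec[c.lo_bin:c.hi_bin] for c in configs]


def box_filter(band, kf, kt):
    """Stand-in for one learned conv filter: a 'valid' kf x kt moving average."""
    n_f, n_t = len(band), len(band[0])
    out = []
    for f in range(n_f - kf + 1):
        row = []
        for t in range(n_t - kt + 1):
            total = sum(band[f + i][t + j] for i in range(kf) for j in range(kt))
            row.append(total / (kf * kt))
        out.append(row)
    return out


def pool_freq(fmap, scale):
    """Max-pool along frequency only; any remainder rows are dropped."""
    return [
        [max(col) for col in zip(*fmap[i:i + scale])]
        for i in range(0, len(fmap) - scale + 1, scale)
    ]


def process(spec, configs):
    """Return, per band, the (filters, freq, time) shape its features would have."""
    shapes = []
    for band, c in zip(split_bands(spec, configs), configs):
        fmap = pool_freq(box_filter(band, c.kernel_freq, c.kernel_time), c.pool)
        shapes.append((c.num_filters, len(fmap), len(fmap[0])))
    return shapes


# Toy 16-bin x 8-frame magnitude spectrogram.
spec = [[float(f + t) for t in range(8)] for f in range(16)]
configs = [
    # Low band: narrow in frequency, long in time; more filters, no pooling.
    BandConfig(0, 8, kernel_freq=1, kernel_time=5, num_filters=16, pool=1),
    # High band: short in time, wide in frequency; fewer filters, stronger pooling.
    BandConfig(8, 16, kernel_freq=5, kernel_time=1, num_filters=4, pool=2),
]
shapes = process(spec, configs)
```

With these toy settings the low band keeps all 8 of its frequency rows and is assigned 16 filters, while the high band is reduced to 2 frequency rows and is assigned 4 filters. In a real network each band would pass through learned convolutional layers, and the band outputs would be merged before estimating the singing voice.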


Pages: 261-265

Page count: 5
