On Filter Generalization for Music Bandwidth Extension Using Deep Neural Networks

Cited by: 8

Authors:

Sulun, Serkan [1]

Davies, Matthew E. P. [2]

Affiliations:

[1] Inst Syst & Comp Engn Technol & Sci INESC TEC, P-4200465 Porto, Portugal

[2] Univ Coimbra, Ctr Informat & Syst, Dept Informat Engn, P-3030790 Coimbra, Portugal

Source:

IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING | 2021 / Vol. 15 / No. 01

Keywords:

Training; Testing; Wideband; Signal to noise ratio; Training data; Noise reduction; Neural networks; Audio bandwidth extension; audio enhancement; deep neural networks; generalization; regularization; overfitting; SPEECH;

DOI:

10.1109/JSTSP.2020.3037485

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology];

Subject classification codes:

0808 ; 0809 ;

Abstract:

In this paper, we address a subtopic of the broad domain of audio enhancement, namely musical audio bandwidth extension. We formulate the bandwidth extension problem using deep neural networks, where a band-limited signal is provided as input to the network, with the goal of reconstructing a full-bandwidth output. Our main contribution centers on the impact of the choice of low-pass filter when training and subsequently testing the network. For two different state-of-the-art deep architectures, ResNet and U-Net, we demonstrate that when the training and testing filters are matched, improvements in signal-to-noise ratio (SNR) of up to 7 dB can be obtained. However, when these filters differ, the improvement falls considerably and under some training conditions results in a lower SNR than the band-limited input. To circumvent this apparent overfitting to filter shape, we propose a data augmentation strategy which utilizes multiple low-pass filters during training and leads to improved generalization to unseen filtering conditions at test time.
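As a rough illustration of the augmentation idea in the abstract, the sketch below builds (band-limited input, full-bandwidth target) training pairs by drawing a low-pass filter at random from a small bank, and measures reconstruction quality as SNR in dB. It is a minimal sketch under stated assumptions, not the authors' implementation: the filter families, orders, ripple values, cutoff, and the helper names `FILTER_BANK`, `design_lowpass`, `band_limit`, `make_training_pair`, and `snr_db` are all hypothetical, since the abstract does not specify them.

```python
import numpy as np
from scipy import signal

# Hypothetical filter bank (family, order). The abstract does not list the
# filters actually used, so these choices are assumptions for illustration.
FILTER_BANK = [("butter", 6), ("cheby1", 8), ("ellip", 6)]


def design_lowpass(family, order, cutoff_hz, sample_rate):
    """Return second-order sections for one low-pass filter from the bank."""
    common = dict(btype="low", fs=sample_rate, output="sos")
    if family == "butter":
        return signal.butter(order, cutoff_hz, **common)
    if family == "cheby1":
        # 1 dB passband ripple (assumed value)
        return signal.cheby1(order, 1.0, cutoff_hz, **common)
    if family == "ellip":
        # 1 dB passband ripple, 40 dB stopband attenuation (assumed values)
        return signal.ellip(order, 1.0, 40.0, cutoff_hz, **common)
    raise ValueError(f"unknown filter family: {family}")


def band_limit(full_band, family, order, cutoff_hz, sample_rate):
    """Low-pass a full-bandwidth signal with zero-phase filtering."""
    sos = design_lowpass(family, order, cutoff_hz, sample_rate)
    return signal.sosfiltfilt(sos, full_band)


def make_training_pair(full_band, cutoff_hz, sample_rate, rng):
    """Pick a random filter from the bank and return (input, target)."""
    family, order = FILTER_BANK[rng.integers(len(FILTER_BANK))]
    band_limited = band_limit(full_band, family, order, cutoff_hz, sample_rate)
    return band_limited, full_band


def snr_db(reference, estimate):
    """Signal-to-noise ratio of an estimate against a reference, in dB."""
    noise = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(noise ** 2))
```

In a training loop, `make_training_pair` would be called once per example with a shared `np.random.default_rng()` generator, so the network sees a different filter shape from one example to the next. Evaluating `snr_db` on inputs band-limited by a filter held out of the bank would give one way to probe the unseen-filter condition the abstract describes.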

Pages: 132-142

Page count: 11
