Speech Bandwidth Extension Using Bottleneck Features and Deep Recurrent Neural Networks

Cited by: 32

Authors:

Gu, Yu [1]

Ling, Zhen-Hua [1]

Dai, Li-Rong [1]

Affiliations:

[1] University of Science and Technology of China, National Engineering Laboratory for Speech and Language Information Processing, Hefei, Anhui, People's Republic of China

Source:

17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES | 2016

Keywords:

speech bandwidth extension; deep neural networks; recurrent neural networks; long short-term memory; bottleneck features;

DOI:

10.21437/Interspeech.2016-678

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

This paper presents a novel method for speech bandwidth extension (BWE) using deep structured neural networks. In order to utilize linguistic information during the prediction of high-frequency spectral components, the bottleneck (BN) features derived from a deep neural network (DNN)-based state classifier for narrowband speech are employed as auxiliary input. Furthermore, recurrent neural networks (RNNs) incorporating long short-term memory (LSTM) cells are adopted to model the complex mapping relationship between the feature sequences describing low-frequency and high-frequency spectra. Experimental results show that the BWE method proposed in this paper can achieve better performance than the conventional method based on Gaussian mixture models (GMMs) and the state-of-the-art approach based on DNNs in both objective and subjective tests.

Pages: 297-301

Page count: 5

50 items in total

[31] Combining Speech Features for Aggression Detection Using Deep Neural Networks
Jaafar, Noussaiba
Lachiri, Zied
[J]. 2020 5TH INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES FOR SIGNAL AND IMAGE PROCESSING (ATSIP'2020), 2020,
[32] Decoding Imagined Speech using Wavelet Features and Deep Neural Networks
Panachakel, Jerrin Thomas
Ramakrishnan, A. G.
Ananthapadmanabha, T. V.
[J]. 2019 IEEE 16TH INDIA COUNCIL INTERNATIONAL CONFERENCE (IEEE INDICON 2019), 2019,
[33] RECURRENT DEEP NEURAL NETWORKS FOR ROBUST SPEECH RECOGNITION
Weng, Chao
Yu, Dong
Watanabe, Shinji
Juang, Biing-Hwang
[J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
[34] Investigation of Bottleneck Features and Multilingual Deep Neural Networks for Speaker Verification
Tian, Yao
Cai, Meng
He, Liang
Liu, Jia
[J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1151 - 1155
[35] Speech Bandwidth Extension Using Recurrent Temporal Restricted Boltzmann Machines
Wang, Yingxue
Zhao, Shenghui
Li, Jianxin
Kuang, Jingming
[J]. IEEE SIGNAL PROCESSING LETTERS, 2016, 23 (12) : 1877 - 1881
[36] DETECTING ALZHEIMER'S DISEASE FROM SPEECH USING NEURAL NETWORKS WITH BOTTLENECK FEATURES AND DATA AUGMENTATION
Liu, Zhaoci
Guo, Zhiqiang
Ling, Zhenhua
Li, Yunxia
[J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 7323 - 7327
[37] Speech prediction using recurrent neural networks
Varoglu, E
Hacioglu, K
[J]. ELECTRONICS LETTERS, 1999, 35 (16) : 1353 - 1355
[38] Mood Disorder Identification Using Deep Bottleneck Features of Elicited Speech
Huang, Kun-Yi
Wu, Chung-Hsien
Su, Ming-Hsiang
Chou, Chia-Hui
[J]. 2017 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC 2017), 2017, : 1648 - 1652
[39] Modulation spectral features for speech emotion recognition using deep neural networks
Singh, Premjeet
Sahidullah, Md
Saha, Goutam
[J]. SPEECH COMMUNICATION, 2023, 146 : 53 - 69
[40] Regularized sparse features for noisy speech enhancement using deep neural networks
Khattak, Muhammad Irfan
Saleem, Nasir
Gao, Jiechao
Verdu, Elena
Fuente, Javier Parra
[J]. COMPUTERS & ELECTRICAL ENGINEERING, 2022, 100
