Speech Bandwidth Extension Using Bottleneck Features and Deep Recurrent Neural Networks

Cited by: 32
Authors
Gu, Yu [1]
Ling, Zhen-Hua [1]
Dai, Li-Rong [1]
Affiliations
[1] Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei, Anhui, Peoples R China
Keywords
speech bandwidth extension; deep neural networks; recurrent neural networks; long short-term memory; bottleneck features
DOI
10.21437/Interspeech.2016-678
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents a novel method for speech bandwidth extension (BWE) using deep structured neural networks. In order to utilize linguistic information during the prediction of high-frequency spectral components, the bottleneck (BN) features derived from a deep neural network (DNN)-based state classifier for narrowband speech are employed as auxiliary input. Furthermore, recurrent neural networks (RNNs) incorporating long short-term memory (LSTM) cells are adopted to model the complex mapping relationship between the feature sequences describing low-frequency and high-frequency spectra. Experimental results show that the BWE method proposed in this paper can achieve better performance than the conventional method based on Gaussian mixture models (GMMs) and the state-of-the-art approach based on DNNs in both objective and subjective tests.
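The mapping described in the abstract can be illustrated with a minimal sketch: per-frame low-frequency spectral features are concatenated with bottleneck features and passed through an LSTM layer that emits high-frequency spectral features. This is not the authors' implementation. The class `TinyLSTM`, the function `extend_bandwidth`, all dimensions, the single-layer topology, and the random untrained weights are assumptions chosen only to show the data flow; the bottleneck features here are random stand-ins for the output of a DNN state classifier.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rng, rows, cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TinyLSTM:
    """One LSTM layer plus a linear output layer (hypothetical, untrained)."""

    def __init__(self, in_dim, hid_dim, out_dim, seed=0):
        rng = random.Random(seed)
        self.hid_dim = hid_dim
        # One weight matrix per gate (input, forget, output, candidate),
        # each acting on the concatenation [input ; previous hidden state].
        self.W = {g: rand_matrix(rng, hid_dim, in_dim + hid_dim) for g in "ifoc"}
        self.b = {g: [0.0] * hid_dim for g in "ifoc"}
        self.W_out = rand_matrix(rng, out_dim, hid_dim)

    def forward(self, frames):
        h = [0.0] * self.hid_dim  # hidden state
        c = [0.0] * self.hid_dim  # cell state
        outputs = []
        for x in frames:
            z = x + h  # list concatenation: [input ; previous hidden state]
            pre = {
                g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
                for g in "ifoc"
            }
            i_gate = [sigmoid(v) for v in pre["i"]]
            f_gate = [sigmoid(v) for v in pre["f"]]
            o_gate = [sigmoid(v) for v in pre["o"]]
            cand = [math.tanh(v) for v in pre["c"]]
            c = [f * cv + i * g for f, cv, i, g in zip(f_gate, c, i_gate, cand)]
            h = [o * math.tanh(cv) for o, cv in zip(o_gate, c)]
            outputs.append(matvec(self.W_out, h))
        return outputs


def extend_bandwidth(lf_frames, bn_frames, model):
    """Concatenate low-frequency and bottleneck features frame by frame,
    then predict high-frequency features with the recurrent model."""
    inputs = [lf + bn for lf, bn in zip(lf_frames, bn_frames)]
    return model.forward(inputs)


# Assumed toy dimensions: 8 low-frequency, 4 bottleneck, 6 high-frequency
# features per frame, over a 5-frame sequence.
LF_DIM, BN_DIM, HF_DIM, T = 8, 4, 6, 5
rng = random.Random(1)
lf = [[rng.gauss(0.0, 1.0) for _ in range(LF_DIM)] for _ in range(T)]
bn = [[rng.gauss(0.0, 1.0) for _ in range(BN_DIM)] for _ in range(T)]

model = TinyLSTM(LF_DIM + BN_DIM, 16, HF_DIM)
hf = extend_bandwidth(lf, bn, model)
print(len(hf), len(hf[0]))
```

In a real system the weights would be trained on parallel narrowband and wideband speech, and the predicted high-frequency features would be combined with the narrowband spectrum to reconstruct the wideband signal; those steps are outside this sketch.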
Pages: 297-301
Number of pages: 5
Related papers
50 in total
  • [31] Combining Speech Features for Aggression Detection Using Deep Neural Networks
    Jaafar, Noussaiba
    Lachiri, Zied
    [J]. 2020 5TH INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES FOR SIGNAL AND IMAGE PROCESSING (ATSIP'2020), 2020
  • [32] Decoding Imagined Speech using Wavelet Features and Deep Neural Networks
    Panachakel, Jerrin Thomas
    Ramakrishnan, A. G.
    Ananthapadmanabha, T. V.
    [J]. 2019 IEEE 16TH INDIA COUNCIL INTERNATIONAL CONFERENCE (IEEE INDICON 2019), 2019
  • [33] RECURRENT DEEP NEURAL NETWORKS FOR ROBUST SPEECH RECOGNITION
    Weng, Chao
    Yu, Dong
    Watanabe, Shinji
    Juang, Biing-Hwang
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014
  • [34] Investigation of Bottleneck Features and Multilingual Deep Neural Networks for Speaker Verification
    Tian, Yao
    Cai, Meng
    He, Liang
    Liu, Jia
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015: 1151 - 1155
  • [35] Speech Bandwidth Extension Using Recurrent Temporal Restricted Boltzmann Machines
    Wang, Yingxue
    Zhao, Shenghui
    Li, Jianxin
    Kuang, Jingming
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2016, 23 (12) : 1877 - 1881
  • [36] DETECTING ALZHEIMER'S DISEASE FROM SPEECH USING NEURAL NETWORKS WITH BOTTLENECK FEATURES AND DATA AUGMENTATION
    Liu, Zhaoci
    Guo, Zhiqiang
    Ling, Zhenhua
    Li, Yunxia
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021: 7323 - 7327
  • [37] Speech prediction using recurrent neural networks
    Varoglu, E
    Hacioglu, K
    [J]. ELECTRONICS LETTERS, 1999, 35 (16) : 1353 - 1355
  • [38] Mood Disorder Identification Using Deep Bottleneck Features of Elicited Speech
    Huang, Kun-Yi
    Wu, Chung-Hsien
    Su, Ming-Hsiang
    Chou, Chia-Hui
    [J]. 2017 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC 2017), 2017: 1648 - 1652
  • [39] Modulation spectral features for speech emotion recognition using deep neural networks
    Singh, Premjeet
    Sahidullah, Md
    Saha, Goutam
    [J]. SPEECH COMMUNICATION, 2023, 146 : 53 - 69
  • [40] Regularized sparse features for noisy speech enhancement using deep neural networks
    Khattak, Muhammad Irfan
    Saleem, Nasir
    Gao, Jiechao
    Verdu, Elena
    Fuente, Javier Parra
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2022, 100