Speech Bandwidth Extension Using Bottleneck Features and Deep Recurrent Neural Networks

Cited by: 32
Authors
Gu, Yu [1]
Ling, Zhen-Hua [1]
Dai, Li-Rong [1]
Affiliations
[1] University of Science and Technology of China, National Engineering Laboratory for Speech and Language Information Processing, Hefei, Anhui, People's Republic of China
Keywords
speech bandwidth extension; deep neural networks; recurrent neural networks; long short-term memory; bottleneck features;
DOI
10.21437/Interspeech.2016-678
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
This paper presents a novel method for speech bandwidth extension (BWE) using deep structured neural networks. In order to utilize linguistic information during the prediction of high-frequency spectral components, the bottleneck (BN) features derived from a deep neural network (DNN)-based state classifier for narrowband speech are employed as auxiliary input. Furthermore, recurrent neural networks (RNNs) incorporating long short-term memory (LSTM) cells are adopted to model the complex mapping relationship between the feature sequences describing low-frequency and high-frequency spectra. Experimental results show that the BWE method proposed in this paper can achieve better performance than the conventional method based on Gaussian mixture models (GMMs) and the state-of-the-art approach based on DNNs in both objective and subjective tests.
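The two ideas in the abstract (appending bottleneck features to the narrowband input, and mapping the resulting frame sequence to high-frequency features with an LSTM-based recurrent network) can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names `add_bottleneck` and `TinyLSTM`, all feature dimensions, the single-layer architecture, and the random untrained weights are assumptions chosen only to show the data flow.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def add_bottleneck(nb_frames, bn_frames):
    """Append the bottleneck (BN) vector of each frame to its narrowband features."""
    if len(nb_frames) != len(bn_frames):
        raise ValueError("feature sequences must have the same number of frames")
    return [nb + bn for nb, bn in zip(nb_frames, bn_frames)]


class TinyLSTM:
    """One LSTM layer followed by a linear output layer (hypothetical, untrained)."""

    def __init__(self, in_dim, hid_dim, out_dim, seed=0):
        rng = random.Random(seed)
        self.hid_dim = hid_dim
        # One weight matrix per gate (input, forget, output, candidate),
        # each acting on the concatenation [x_t ; h_{t-1}].
        self.W = {g: rand_matrix(hid_dim, in_dim + hid_dim, rng) for g in "ifoc"}
        self.b = {g: [0.0] * hid_dim for g in "ifoc"}
        self.W_out = rand_matrix(out_dim, hid_dim, rng)

    def forward(self, frames):
        h = [0.0] * self.hid_dim  # hidden state
        c = [0.0] * self.hid_dim  # cell state
        outputs = []
        for x in frames:
            z = x + h  # list concatenation: [x_t ; h_{t-1}]
            pre = {g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
                   for g in "ifoc"}
            i = [sigmoid(v) for v in pre["i"]]
            f = [sigmoid(v) for v in pre["f"]]
            o = [sigmoid(v) for v in pre["o"]]
            cand = [math.tanh(v) for v in pre["c"]]
            c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, cand)]
            h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
            outputs.append(matvec(self.W_out, h))  # predicted high-frequency features
        return outputs


# Toy dimensions (assumed, not taken from the paper).
T, NB_DIM, BN_DIM, HF_DIM = 5, 4, 3, 2
rng = random.Random(1)
nb = [[rng.random() for _ in range(NB_DIM)] for _ in range(T)]  # narrowband spectra
bn = [[rng.random() for _ in range(BN_DIM)] for _ in range(T)]  # bottleneck features
inputs = add_bottleneck(nb, bn)
model = TinyLSTM(NB_DIM + BN_DIM, hid_dim=8, out_dim=HF_DIM)
hf = model.forward(inputs)
print(len(hf), len(hf[0]))  # one HF_DIM-dimensional prediction per input frame
```

In a real system the bottleneck vectors would come from a hidden layer of a trained state classifier and the recurrent network would be trained on paired narrowband and wideband speech; here both are replaced by random numbers so the sketch stays self-contained.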
Pages: 297-301
Number of pages: 5
Related papers
50 in total
  • [1] Efficient deep neural networks for speech synthesis using bottleneck features
    Joo, Young-Sun
    Jun, Won-Suk
    Kang, Hong-Goo
    [J]. 2016 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2016,
  • [2] Sequential Deep Neural Networks Ensemble for Speech Bandwidth Extension
    Lee, Bong-Ki
    Noh, Kyounjin
    Chang, Joon-Hyuk
    Choo, Kihyun
    Oh, Eunmi
    [J]. IEEE ACCESS, 2018, 6 : 27039 - 27047
  • [3] Waveform Modeling and Generation Using Hierarchical Recurrent Neural Networks for Speech Bandwidth Extension
    Ling, Zhen-Hua
    Ai, Yang
    Gu, Yu
    Dai, Li-Rong
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (05) : 883 - 894
  • [4] BLIND BANDWIDTH EXTENSION BASED ON CONVOLUTIONAL AND RECURRENT DEEP NEURAL NETWORKS
    Schmidt, Konstantin
    Edler, Bernd
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5444 - 5448
  • [5] Audio bandwidth extension using ensemble of recurrent neural networks
    Xin Liu
    Chang-Chun Bao
    [J]. EURASIP Journal on Audio, Speech, and Music Processing, 2016
  • [6] Audio bandwidth extension using ensemble of recurrent neural networks
    Liu, Xin
    Bao, Chang-Chun
    [J]. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2016, : 1 - 12
  • [7] Artificial Speech Bandwidth Extension Using Deep Neural Networks for Wideband Spectral Envelope Estimation
    Abel, Johannes
    Fingscheidt, Tim
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (01) : 71 - 83
  • [8] Improved Bottleneck Features Using Pretrained Deep Neural Networks
    Yu, Dong
    Seltzer, Michael L.
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 244 - 247
  • [9] Emotion recognition from speech using deep recurrent neural networks with acoustic features
    Byun, Sung-Woo
    Shin, Bo-Ra
    Lee, Seok-Pil
    Han, Hyuk-Soo
    [J]. BASIC & CLINICAL PHARMACOLOGY & TOXICOLOGY, 2018, 123 : 43 - 44
  • [10] Mapping Neural Networks for Bandwidth Extension of Narrowband Speech
    Shahina, A.
    Yegnanarayana, B.
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1435 - 1438