BLIND BANDWIDTH EXTENSION BASED ON CONVOLUTIONAL AND RECURRENT DEEP NEURAL NETWORKS

Authors
Schmidt, Konstantin [1]
Edler, Bernd [1]
Affiliations
[1] Int Audio Labs Erlangen, Am Wolfsmantel 33, D-91058 Erlangen, Germany
Keywords
Blind Bandwidth Extension; Artificial Bandwidth Extension; Speech Coding; Audio Coding; DNN; regressive DNN; LSTM; CNN
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
A blind bandwidth extension (BBWE) expands the bandwidth of telephone speech, which is often limited to the range from 0.2 to 3.4 kHz. The advantage is an increase in both perceived quality and intelligibility. This work presents a BBWE similar to state-of-the-art bandwidth extensions such as Intelligent Gap Filling, with the difference that all processing is done in the decoder without the need to transmit extra bits. Parameters such as the spectral envelope are estimated by a regressive convolutional deep neural network (CNN) with long short-term memory (LSTM). The system operates on frames of 20 ms without additional algorithmic delay and can be applied in state-of-the-art speech and audio codecs.
Pages: 5444-5448
Page count: 5