Bandwidth extension of narrowband speech in log spectra domain using neural network

Cited by: 0
Authors
Pourmohammadi, Sara [1]
Vali, Mansour [2]
Ghadyani, Mohsen [1]
Affiliations
[1] Univ Shahed, Fac Elect Engn, Tehran, Iran
[2] KN Toosi Univ Technol, Dept Elect & Comp Engn, Tehran, Iran
Keywords
Bandwidth extension; log spectra domain; narrowband speech; neural network; wideband speech; RECOGNITION; RECONSTRUCTION; FREQUENCY;
DOI
10.3906/elk-1212-109
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, there have been significant advances in communication technology, but speech signals still suffer from low perceived quality caused by the bandwidth limitations of telephone networks. Bandwidth extension (BWE) adds high-frequency components of the speech signal to band-limited telephone speech and significantly improves speech perception. In this work, we develop a new method for representing the vocal tract filter coefficients using log filter bank energy (LFBE) parameters as an alternative to mel-frequency cepstral coefficients (MFCCs). This approach is based on the strong correlation between the spectral components of the low- and high-band spectra. Furthermore, the performance of Gaussian mixture model and multilayer perceptron neural network methods for estimating the high-frequency envelope is evaluated. Objective evaluations indicate that the LFBE feature vectors perform better than the MFCCs. In addition, the objective evaluations show that a neural network, which is not commonly used in BWE, achieves better performance than the Gaussian mixture model.
Pages: 433-446
Page count: 14
Related papers
50 in total
  • [31] Mel-Frequency Cepstral Coefficient-Based Bandwidth Extension of Narrowband Speech
    Nour-Eldin, Amr H.
    Kabal, Peter
    [J]. INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 53 - 56
  • [32] Bandwidth extension of a narrowband speech coder for music streaming services over IP networks
    Lee, Young Han
    Kim, Hong Kook
    [J]. 2007 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS, VOLS 1 AND 2, 2007, : 552 - 555
  • [33] COMBINING FRONTEND-BASED MEMORY WITH MFCC FEATURES FOR BANDWIDTH EXTENSION OF NARROWBAND SPEECH
    Nour-Eldin, Amr H.
    Kabal, Peter
    [J]. 2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 4001 - 4004
  • [34] Speech bandwidth extension method using speech recognition and speech synthesis
    Takashina, Masashi
    Kuroiwa, Shingo
    Tsuge, Satoru
    Ren, Fuji
    [J]. 2006 10TH INTERNATIONAL CONFERENCE ON COMMUNICATION TECHNOLOGY, VOLS 1 AND 2, PROCEEDINGS, 2006, : 1273 - +
  • [35] Artificial Bandwidth Extension for Speech Signals using Speech Recognition
    Kuroiwa, Shingo
    Takashina, Masashi
    Tsuge, Satoru
    Ren, Fuji
    [J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1045 - 1048
  • [36] Waveform Modeling Using Stacked Dilated Convolutional Neural Networks for Speech Bandwidth Extension
    Gu, Yu
    Ling, Zhen-Hua
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1123 - 1127
  • [37] Restoring High Frequency Spectral Envelopes Using Neural Networks for Speech Bandwidth Extension
    Gu, Yu
    Ling, Zhen-Hua
    [J]. 2015 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2015,
  • [38] Waveform Modeling and Generation Using Hierarchical Recurrent Neural Networks for Speech Bandwidth Extension
    Ling, Zhen-Hua
    Ai, Yang
    Gu, Yu
    Dai, Li-Rong
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (05) : 883 - 894
  • [39] Sequential Deep Neural Networks Ensemble for Speech Bandwidth Extension
    Lee, Bong-Ki
    Noh, Kyounjin
    Chang, Joon-Hyuk
    Choo, Kihyun
    Oh, Eunmi
    [J]. IEEE ACCESS, 2018, 6 : 27039 - 27047
  • [40] Memory-Based Approximation of the Gaussian Mixture Model Framework for Bandwidth Extension of Narrowband Speech
    Nour-Eldin, Amr H.
    Kabal, Peter
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 1192 - 1195