Bandwidth extension of narrowband speech in log spectra domain using neural network

Cited by: 0
Authors
Pourmohammadi, Sara [1 ]
Vali, Mansour [2 ]
Ghadyani, Mohsen [1 ]
Affiliations
[1] Univ Shahed, Fac Elect Engn, Tehran, Iran
[2] KN Toosi Univ Technol, Dept Elect & Comp Engn, Tehran, Iran
Keywords
Bandwidth extension; log spectra domain; narrowband speech; neural network; wideband speech; RECOGNITION; RECONSTRUCTION; FREQUENCY;
DOI
10.3906/elk-1212-109
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Communication technology has advanced significantly in recent years, but speech signals still suffer from low perceived quality caused by the bandwidth limitations of telephone networks. Bandwidth extension (BWE) adds high-frequency components of the speech signal to band-limited telephone speech and improves speech perception significantly. In this work, we develop a new method for representing vocal tract filter coefficients using log filter bank energy (LFBE) parameters as an alternative to mel-frequency cepstral coefficients (MFCCs). The approach relies on a strong correlation between the spectral components of the low-band and high-band spectra. Furthermore, the performance of Gaussian mixture model and multilayer perceptron neural network methods for estimating the high-frequency envelope is evaluated. Objective evaluations indicate that the LFBE feature vectors perform better than the MFCCs. In addition, the objective evaluations show that a neural network, which is not common in BWE, performs better than the Gaussian mixture model.
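The relationship between the two feature types named in the abstract can be illustrated with a minimal sketch. It assumes the conventional definition in which MFCCs are the discrete cosine transform of log mel filter bank energies, so LFBE features are what remains when that final transform is skipped. The sampling rate, frame length, band edges, filter count, and all function names below are illustrative assumptions; the abstract does not state the filter bank the authors used.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate, f_lo, f_hi):
    """Triangular filters spaced evenly on the mel scale (assumed layout)."""
    hz_edges = mel_to_hz(np.linspace(hz_to_mel(f_lo), hz_to_mel(f_hi), n_filters + 2))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def lfbe(frame, bank, n_fft):
    """Log filter bank energies of one Hamming-windowed frame."""
    power = np.abs(np.fft.rfft(frame * np.hamming(frame.size), n=n_fft)) ** 2
    return np.log(bank @ power + 1e-10)  # small floor avoids log(0)

def dct_ii(x):
    """Unnormalized DCT-II; applied to LFBEs it yields MFCC-style coefficients."""
    n = x.size
    k = np.arange(n)[:, None]
    return np.cos(np.pi * k * (2 * np.arange(n) + 1) / (2 * n)) @ x

# Assumed narrowband setup: 8 kHz sampling, 20 ms frame, 300-3400 Hz telephone band.
sample_rate, n_fft, n_filters = 8000, 256, 12
rng = np.random.default_rng(0)
t = np.arange(160) / sample_rate
frame = np.sin(2 * np.pi * 500 * t) + 0.5 * np.sin(2 * np.pi * 1500 * t)
frame = frame + 0.01 * rng.standard_normal(t.size)

bank = mel_filterbank(n_filters, n_fft, sample_rate, 300.0, 3400.0)
feats = lfbe(frame, bank, n_fft)   # LFBE feature vector
mfcc = dct_ii(feats)               # MFCC-style vector derived from the same energies
```

In this sketch each LFBE component stays tied to one frequency band, whereas each cepstral coefficient mixes all bands. Under the assumptions above, a mapping model such as a multilayer perceptron would take `feats` from the narrowband signal as input and predict the corresponding high-band envelope parameters.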
Pages: 433-446
Number of pages: 14
Related papers
50 in total
  • [1] Mapping Neural Networks for Bandwidth Extension of Narrowband Speech
    Shahina, A.
    Yegnanarayana, B.
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1435 - 1438
  • [2] TIME-DOMAIN NEURAL NETWORK APPROACH FOR SPEECH BANDWIDTH EXTENSION
    Hao, Xiang
    Xu, Chenglin
    Hou, Nana
    Xie, Lei
    Chng, Eng Siong
    Li, Haizhou
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 866 - 870
  • [3] Bandwidth extension of narrowband speech using cepstral analysis
    Soon, IY
    Yeo, CK
    [J]. PROCEEDINGS OF THE 2004 INTERNATIONAL SYMPOSIUM ON INTELLIGENT MULTIMEDIA, VIDEO AND SPEECH PROCESSING, 2004, : 242 - 245
  • [4] Bandwidth extension of narrowband speech using integer wavelet transform
    Nizampatnam, Prasad
    Tappeta, Kishore Kumar
    [J]. IET SIGNAL PROCESSING, 2017, 11 (04) : 437 - 445
  • [5] Narrowband Speech Signal Bandwidth Extension for Intelligible Speech Communication
    Ganesh, Mirishkar Sai
    Patnaik, Bijayananda
    Karthik, M. L. N. S.
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT TECHNIQUES IN CONTROL, OPTIMIZATION AND SIGNAL PROCESSING (INCOS), 2017,
  • [6] Combining equalization and estimation for bandwidth extension of narrowband speech
    Qian, YS
    Kabal, P
    [J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 713 - 716
  • [7] RECURRENT NEURAL NETWORK FOR SPECTRAL MAPPING IN SPEECH BANDWIDTH EXTENSION
    Wang, Yingxue
    Zhao, Shenghui
    Li, Jianxin
    Kuang, Jingming
    Zhu, Qiang
    [J]. 2016 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP), 2016, : 242 - 246
  • [8] Block-based bandwidth extension of narrowband speech signal by using CDHMM
    Yao, S
    Chan, CF
    [J]. 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 793 - 796
  • [9] Robust Speech Recognition Using MLP Neural Network in Log-Spectral Domain
    Ghaemmaghami, Masoumeh P.
    Sameti, Hossein
    Razzazi, Farbod
    BabaAli, Bagher
    Dabbaghchian, Saeed
    [J]. 2009 IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (ISSPIT 2009), 2009, : 467 - +
  • [10] Bandwidth Extension of Narrowband Speech Based on Hidden Markov Model
    Yong, Zhang
    Yi, Liu
    [J]. 2014 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP), VOLS 1-2, 2014, : 372 - 376