Speech Enhancement Based on the Multi-Scales and Multi-Thresholds of the Auditory Perception Wavelet Transform

Cited by: 2
Authors
Tao, Zhi [1, 2]
Zhao, He-Ming [1]
Zhang, Xiao-Jun [2]
Wu, Di [1, 2]
Affiliations
[1] Soochow Univ, Sch Elect Informat, Suzhou 215006, Peoples R China
[2] Soochow Univ, Sch Phys Sci & Technol, Suzhou 215006, Peoples R China
Keywords
speech enhancement; low SNR; auditory perception wavelet transform; unvoiced enhancement; masking effect
DOI
10.2478/v10168-011-0037-5
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
This paper proposes a speech enhancement method based on multi-scale, multi-threshold processing of the auditory perception wavelet transform, suited to low signal-to-noise ratio (SNR) environments. The method reduces noise by thresholding the auditory perception wavelet transform parameters of the speech signal according to the auditory masking effect of the human ear. To prevent high-frequency loss during noise suppression, a voicing decision is first made on the speech signal; the unvoiced and voiced segments are then processed with different thresholds and different decision rules. Finally, objective and subjective tests are performed on the enhanced speech. The results show that, compared with other spectral subtraction methods, the proposed method keeps the unvoiced components intact while suppressing both the residual noise and the background noise, so the enhanced speech has better clarity and intelligibility.
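The processing order described in the abstract (voicing decision first, then scale-dependent thresholding with a different threshold set for unvoiced and voiced frames) can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar wavelet stands in for the auditory perception wavelet transform, the fixed threshold tuples stand in for masking-derived thresholds, and the zero-crossing-rate test stands in for the paper's voicing decision. All function names and numeric values below are hypothetical.

```python
import math

def haar_dwt(x):
    """One level of the orthonormal Haar transform (len(x) must be even)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) / s, (a - d) / s))
    return out

def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; zero those below t."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def is_unvoiced(frame, zcr_limit=0.3):
    """Crude stand-in voicing decision: a high zero-crossing rate
    is taken to indicate an unvoiced frame (assumed limit)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
    return crossings / (len(frame) - 1) > zcr_limit

def enhance_frame(frame, levels=3,
                  voiced_thr=(0.30, 0.20, 0.10),
                  unvoiced_thr=(0.05, 0.05, 0.05)):
    """Threshold detail coefficients per scale (index 0 = finest scale).
    Unvoiced frames get smaller thresholds so that their high-frequency
    content is kept. len(frame) must be divisible by 2**levels."""
    thresholds = unvoiced_thr if is_unvoiced(frame) else voiced_thr
    approx, details = list(frame), []
    for level in range(levels):
        approx, detail = haar_dwt(approx)
        details.append(soft_threshold(detail, thresholds[level]))
    for detail in reversed(details):
        approx = haar_idwt(approx, detail)
    return approx
```

Setting every threshold to zero makes `enhance_frame` return the input frame up to rounding error, which is a quick way to check the transform pair before experimenting with threshold values.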
Pages: 519-532
Number of pages: 14
Related papers
50 in total
  • [1] Multi-scales land-use image fusion method based on wavelet transform
    Zhang, Xinchang
    Fu, Yu
    Zhu, Jiamin
    Zhang, Wenjiang
    [J]. GEOINFORMATICS 2007: REMOTELY SENSED DATA AND INFORMATION, PTS 1 AND 2, 2007, 6752
  • [2] Prediction of chaotic time series based on kernel function and multi-scales wavelet transform
    Gao, Lan
    Hua, Qing
    Fu, Yixiang
    Zhou, Jinyong
    Song, Qingguo
    [J]. CISP 2008: FIRST INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, VOL 4, PROCEEDINGS, 2008, : 311 - 316
  • [3] Tailored OFDM signal with multi-scales transform
    Huang, X
    Lu, J
    Zheng, J
    [J]. ELECTRONICS LETTERS, 2004, 40 (01) : 88 - 90
  • [4] Speech enhancement by noise driven adaptation of perceptual scales and thresholds of continuous wavelet transform coefficients
    Swami, Preety D.
    Sharma, Rupali
    Jain, Alok
    Swami, Dhirendra K.
    [J]. SPEECH COMMUNICATION, 2015, 70 : 1 - 12
  • [5] The Multi-scales Nonlinear Enhancement Method of THz Image
    Zhang Peng
    Hu Weiliang
    Luo Wenjian
    Zhang Zhihui
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON MICROWAVE TECHNOLOGY & COMPUTATIONAL ELECTROMAGNETICS (ICMTCE), 2013, : 341 - 344
  • [6] Image enhancement based on Multi-wavelet transform
    Wang, Xiu Bi
    Chen, Ming Ju
    [J]. PROCEEDINGS OF THE SECOND INTERNATIONAL CONFERENCE ON MODELLING AND SIMULATION (ICMS2009), VOL 3, 2009, : 124 - 127
  • [7] An Otsu multi-thresholds segmentation algorithm based on improved ACO
    Jun Qin
    Xuanjing Shen
    Fang Mei
    Zheng Fang
    [J]. The Journal of Supercomputing, 2019, 75 : 955 - 967
  • [8] An Otsu multi-thresholds segmentation algorithm based on improved ACO
    Qin, Jun
    Shen, Xuanjing
    Mei, Fang
    Fang, Zheng
    [J]. JOURNAL OF SUPERCOMPUTING, 2019, 75 (02): : 955 - 967
  • [9] Automatic Multi-thresholds Selection for Image Segmentation based on Evolutionary Approach
    Quoc Bao Truong
    Lee, Byung Ryong
    [J]. INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS, 2013, 11 (04) : 834 - 844
  • [10] Multi-domain speech compression based on wavelet packet transform
    Wu, XD
    Li, YM
    Chen, HY
    [J]. ELECTRONICS LETTERS, 1998, 34 (02) : 154 - 155