Speech Enhancement Based on the Multi-Scales and Multi-Thresholds of the Auditory Perception Wavelet Transform

Cited by: 2

Authors:

Tao, Zhi [1,2]

Zhao, He-Ming [1]

Zhang, Xiao-Jun [2]

Wu, Di [1,2]

Affiliations:

[1] Soochow Univ, Sch Elect Informat, Suzhou 215006, Peoples R China

[2] Soochow Univ, Sch Phys Sci & Technol, Suzhou 215006, Peoples R China

Source:

ARCHIVES OF ACOUSTICS | 2011 / Vol. 36 / No. 03

Keywords:

speech enhancement; low SNR; auditory perception wavelet transform; unvoiced enhancement; masking effect;

DOI:

10.2478/v10168-011-0037-5

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

This paper proposes a speech enhancement method based on multi-scale, multi-threshold processing of the auditory perception wavelet transform, suited to low signal-to-noise ratio (SNR) environments. The method reduces noise by thresholding the auditory perception wavelet transform parameters of the speech signal according to the auditory masking effect of the human ear. To prevent high-frequency loss during noise suppression, a voicing decision is first made on the speech signal; the unvoiced and voiced segments are then processed with different thresholds and different decision rules. Finally, objective and subjective tests are performed on the enhanced speech. The results show that, compared with spectral subtraction methods, the proposed method keeps the unvoiced components intact while suppressing both residual and background noise, so the enhanced speech has better clarity and intelligibility.
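The pipeline described in the abstract (voicing decision first, then scale-dependent thresholds chosen per segment type) can be illustrated with a minimal sketch. This is not the authors' implementation: a plain Haar wavelet stands in for the auditory perception wavelet transform, fixed constants stand in for thresholds derived from the auditory masking effect, and a zero-crossing-rate test stands in for the voicing decision. All function names, parameters, and threshold values below are hypothetical.

```python
import math

S = 1 / math.sqrt(2)  # orthonormal Haar scaling factor


def haar_dwt(x):
    """One Haar analysis level; len(x) must be even."""
    approx = [(x[i] + x[i + 1]) * S for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) * S for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * S, (a - d) * S])
    return out


def soft(c, t):
    """Soft thresholding: shrink the coefficient magnitude by t, floor at 0."""
    return math.copysign(max(abs(c) - t, 0.0), c)


def is_voiced(frame, zcr_limit=0.25):
    """Crude stand-in voicing decision: voiced frames cross zero rarely."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
    return crossings / (len(frame) - 1) < zcr_limit


def enhance_frame(frame, levels=3,
                  voiced_thr=(0.30, 0.20, 0.10),      # assumed values
                  unvoiced_thr=(0.05, 0.05, 0.05)):   # assumed values
    """Threshold each wavelet scale with a threshold set picked by voicing.

    The thresholds are ordered from the finest scale (highest frequencies)
    to the coarsest. Unvoiced frames get gentler thresholds so that their
    high-frequency content is not suppressed along with the noise.
    """
    thr = voiced_thr if is_voiced(frame) else unvoiced_thr
    approx, details = list(frame), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)
    details = [[soft(c, t) for c in d] for d, t in zip(details, thr)]
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx
```

The frame length has to be a multiple of `2 ** levels`, and each threshold tuple needs one entry per level. With all thresholds set to zero the function returns the input frame unchanged (up to rounding), which is a quick way to check the transform pair.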


Pages: 519-532

Page count: 14
