Speech Enhancement Based on the Multi-Scales and Multi-Thresholds of the Auditory Perception Wavelet Transform

Cited by: 2

Authors:

Tao, Zhi ^{[1,2]}

Zhao, He-Ming ^{[1]}

Zhang, Xiao-Jun ^{[2]}

Wu, Di ^{[1,2]}

Affiliations:

[1] Soochow Univ, Sch Elect Informat, Suzhou 215006, Peoples R China

[2] Soochow Univ, Sch Phys Sci & Technol, Suzhou 215006, Peoples R China

Source:

ARCHIVES OF ACOUSTICS | 2011 / Vol. 36 / No. 03

Keywords:

speech enhancement; low SNR; auditory perception wavelet transform; unvoiced enhancement; masking effect;

DOI:

10.2478/v10168-011-0037-5

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

This paper proposes a speech enhancement method using the multi-scales and multi-thresholds of the auditory perception wavelet transform, which is suitable for a low SNR (signal to noise ratio) environment. This method achieves the goal of noise reduction according to the threshold processing of the human ear's auditory masking effect on the auditory perception wavelet transform parameters of a speech signal. At the same time, in order to prevent high frequency loss during the process of noise suppression, we first make a voicing decision based on the speech signals. Afterwards, we process the unvoiced sound segment and the voiced sound segment according to the different thresholds and different judgments. Lastly, we perform objective and subjective tests on the enhanced speech. The results show that, compared to other spectral subtractions, our method keeps the components of unvoiced sound intact, while it suppresses the residual noise and the background noise. Thus, the enhanced speech has better clarity and intelligibility.

Pages: 519-532

Page count: 14
