SPEECH ENHANCEMENT BASED ON JOINT TIME-FREQUENCY SEGMENTATION

Cited by: 2

Authors:

Tantibundhit, C. [1, 2]

Pernkopf, F. [2]

Kubin, G. [2]

Affiliations:

[1] Thammasat Univ, Med Intelligence & Innovat Lab, Bangkok, Thailand

[2] Graz Univ Technol, Signal Proc & Speech Commun Lab, A-8010 Graz, Austria

Source:

2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-8, PROCEEDINGS | 2009

Keywords:

Speech enhancement; transient component; speech intelligibility; wavelet packet transform; SET;

DOI:

10.1109/ICASSP.2009.4960673

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

We present an algorithm to decompose speech into transient and non-transient components. Our algorithm, the joint time-frequency segmentation algorithm, uses the wavelet packet coefficients of the speech signal and represents them as tiles of a time-frequency representation adapted to the characteristics of the signal itself. Any wavelet packet coefficient whose tile height is greater than or equal to its tile width is classified as a transient coefficient; all other coefficients are classified as non-transient. The transient component is selectively amplified and recombined with the original speech, and the energy of the resulting modified speech is adjusted to equal the energy of the original speech. Psychoacoustic tests with fourteen human listeners show that the speech modification significantly improves speech intelligibility in background noise, i.e., by 10% absolute at 0 dB to 31% absolute at -30 dB.
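The classification rule and the energy-matched recombination described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a dyadic wavelet packet tree in which a coefficient at depth `level` covers `2**level` samples in time and a `1 / 2**level` share of the frequency axis, with both axes normalized to the unit interval so that height and width are comparable. The abstract does not state this normalization or the amplification factor, so the normalization and the function names `classify_tile` and `recombine` and the parameter `gain` are all hypothetical.

```python
import math


def classify_tile(level, signal_length):
    """Label a wavelet packet coefficient by the shape of its tile.

    Assumption (not from the abstract): the time-frequency plane is
    normalized to the unit square, so a coefficient at depth `level`
    has width 2**level / signal_length and height 1 / 2**level.
    """
    width = (2 ** level) / signal_length   # normalized time extent
    height = 1.0 / (2 ** level)            # normalized frequency extent
    # Rule from the abstract: height >= width means transient.
    return "transient" if height >= width else "non-transient"


def recombine(original, transient, gain):
    """Add the amplified transient component to the original signal,
    then rescale so the result has the same energy as the original.

    `gain` is a hypothetical amplification factor; the abstract gives
    no value for it.
    """
    mixed = [x + gain * t for x, t in zip(original, transient)]
    energy_original = sum(x * x for x in original)
    energy_mixed = sum(m * m for m in mixed)
    if energy_mixed == 0.0:
        return mixed  # nothing to rescale
    scale = math.sqrt(energy_original / energy_mixed)
    return [scale * m for m in mixed]
```

Under this assumed normalization, a signal of length 64 gives transient tiles for depths up to 3 and non-transient tiles for deeper levels. The rescaling step only changes the overall level of the modified speech, so the relative emphasis of the transient component is preserved.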


Pages: 4673 / +

Page count: 2
