SPEECH ENHANCEMENT BASED ON JOINT TIME-FREQUENCY SEGMENTATION

Cited by: 2
Authors
Tantibundhit, C. [1, 2]
Pernkopf, F. [2]
Kubin, G. [2]
Affiliations
[1] Thammasat Univ, Med Intelligence & Innovat Lab, Bangkok, Thailand
[2] Graz Univ Technol, Signal Proc & Speech Commun Lab, A-8010 Graz, Austria
Keywords
Speech enhancement; transient component; speech intelligibility; wavelet packet transform; SET;
DOI
10.1109/ICASSP.2009.4960673
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
We present an algorithm to decompose speech into transient and non-transient components. Our algorithm, the joint time-frequency segmentation algorithm, takes the wavelet packet coefficients of the speech signal and represents them as tiles of a time-frequency representation adapted to the characteristics of the signal itself. Any wavelet packet coefficient whose tiling height is larger than or equal to its tiling width is classified as a transient coefficient; any coefficient whose tiling height is smaller than its tiling width is classified as a non-transient coefficient. The transient component is selectively amplified and recombined with the original speech to generate the modified speech, whose energy is adjusted to equal that of the original speech. Psychoacoustic tests with fourteen human listeners show that the speech modification significantly improves speech intelligibility in background noise, with gains ranging from 10% absolute at 0 dB to 31% absolute at -30 dB.
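The classification rule and the energy-matched recombination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the gain value, and the tile geometry are assumptions made for illustration. In particular, the sketch assumes a discrete time-frequency plane of n_samples by n_samples bins, so that a coefficient at decomposition depth d covers 2**d time bins (width) and n_samples / 2**d frequency bins (height). It also assumes the transient component has already been reconstructed in the time domain.

```python
import math


def tile_shape(depth, n_samples):
    """Return (width, height) of a wavelet packet tile, in bins.

    Assumption (not from the abstract): the time-frequency plane is
    n_samples x n_samples bins, so every tile has area n_samples.
    """
    width = 2 ** depth            # extent along the time axis
    height = n_samples / width    # extent along the frequency axis
    return width, height


def is_transient(depth, n_samples):
    """Rule from the abstract: tile height >= tile width means transient."""
    width, height = tile_shape(depth, n_samples)
    return height >= width


def energy(signal):
    """Sum of squared samples."""
    return sum(v * v for v in signal)


def enhance(original, transient, gain):
    """Add the amplified transient component to the original signal, then
    rescale so the result has the same energy as the original.

    `gain` is a hypothetical parameter; the abstract gives no value.
    """
    mixed = [o + gain * t for o, t in zip(original, transient)]
    mixed_energy = energy(mixed)
    if mixed_energy == 0.0:
        return mixed              # nothing to rescale
    scale = math.sqrt(energy(original) / mixed_energy)
    return [scale * m for m in mixed]
```

Under these assumptions, for a 64-sample frame the coefficients at depths 0 to 3 are labeled transient and deeper ones non-transient, and `enhance(original, transient, 2.0)` returns a signal whose energy equals that of `original`.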
Pages: 4673 / +
Number of pages: 2