SPEECH ENHANCEMENT BASED ON JOINT TIME-FREQUENCY SEGMENTATION

Cited by: 2
Authors
Tantibundhit, C. [1, 2]
Pernkopf, F. [2]
Kubin, G. [2]
Affiliations
[1] Thammasat Univ, Med Intelligence & Innovat Lab, Bangkok, Thailand
[2] Graz Univ Technol, Signal Proc & Speech Commun Lab, A-8010 Graz, Austria
Keywords
Speech enhancement; transient component; speech intelligibility; wavelet packet transform; SET;
DOI
10.1109/ICASSP.2009.4960673
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
We present an algorithm to decompose speech into transient and non-transient components. The algorithm, called joint time-frequency segmentation, takes the wavelet packet coefficients of the speech signal and represents them as tiles of a time-frequency representation adapted to the characteristics of the signal itself. Any wavelet packet coefficient whose tile height is greater than or equal to its tile width is classified as a transient coefficient; every other coefficient is classified as non-transient. The transient component is selectively amplified and recombined with the original speech to generate the modified speech, whose energy is adjusted to equal the energy of the original speech. Psychoacoustic tests with fourteen human listeners show that the speech modification significantly improves speech intelligibility in background noise, by 10% absolute at 0 dB up to 31% absolute at -30 dB.
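The two rules stated in the abstract (classify a coefficient by its tile shape, then amplify the transient part and restore the original energy) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation. It assumes a normalized time-frequency plane in which a coefficient at wavelet packet depth `depth` of an `n_samples`-long signal has tile width `2**depth` and tile height `n_samples / 2**depth`; the abstract does not give the units used to compare height and width. The function names and the `gain` parameter are hypothetical, and the wavelet packet transform itself is not shown.

```python
import math


def tile_shape(depth, n_samples):
    """Return (width, height) of a wavelet packet tile.

    Assumption: normalized units where every tile has area n_samples,
    so deeper levels give wider (time) and flatter (frequency) tiles.
    """
    width = 2 ** depth                # extent along the time axis
    height = n_samples / 2 ** depth   # extent along the frequency axis
    return width, height


def is_transient(depth, n_samples):
    """Rule from the abstract: height >= width means transient."""
    width, height = tile_shape(depth, n_samples)
    return height >= width


def energy(signal):
    """Sum of squared samples."""
    return sum(v * v for v in signal)


def modify_speech(original, transient, gain):
    """Add the amplified transient component to the original speech,
    then rescale so the result has the same energy as the original.

    `gain` is a hypothetical amplification factor.
    """
    mixed = [o + gain * t for o, t in zip(original, transient)]
    mixed_energy = energy(mixed)
    if mixed_energy == 0.0:
        return mixed  # nothing to rescale
    scale = math.sqrt(energy(original) / mixed_energy)
    return [scale * m for m in mixed]
```

Under the stated tile assumption, a coefficient is transient exactly when its depth is at most half of log2(`n_samples`), so for a 16-sample frame depths 0 to 2 count as transient and depths 3 and 4 do not.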
Pages: 4673 / +
Number of pages: 2