Speech Feature Enhancement based on Time-frequency Analysis

Authors
Do, Duc-Hao [1, 2]
Chau, Thanh-Duc
Tran, Thai-Son
Affiliations
[1] Vietnam Natl Univ, Univ Sci, Ho Chi Minh City, Vietnam
[2] FPT Univ, Ho Chi Minh City, Vietnam
Keywords
Speech feature; multichannel representation; time-frequency analysis; Chirplet Transform; poly-linear chirplet transform; instantaneous frequency; REPRESENTATION; TRANSFORM;
DOI
10.1145/3605549
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Time-frequency analysis (TFA) is a powerful way to expose information hidden in signals, including speech signals. Many TFA techniques were developed to capture the most important stationary features. Human speech, however, is not stationary; it contains non-stationary elements. This work designs a new TFA-based algorithm that extracts the trends and changes inside the speech signal in the time-frequency (TF) plane. The algorithm builds a set of atoms for the signal transform so that the signal can be analyzed from many different view directions via the Poly-Linear Chirplet Transform (PLCT). After processing the signal, the proposed method returns a multichannel output in which each channel results from a particular Linear Chirplet Transform (LCT). This feature is then combined with the MFCC feature to form the final representation. Although the speech representation becomes larger, the extracted feature carries rich information that improves recognition results over other features in gender recognition, dialect recognition, and speaker recognition.
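The multichannel idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a textbook linear chirplet transform (a short-time Fourier transform whose window is demodulated by a quadratic phase), a hand-picked list of chirp rates in place of the paper's atom-construction step, and hypothetical function names and parameters (`linear_chirplet_transform`, `poly_linear_chirplet`, `win_len`, `hop`). The MFCC combination step is omitted.

```python
import numpy as np

def linear_chirplet_transform(x, fs, chirp_rate, win_len=256, hop=128):
    """Magnitude of a basic LCT for one chirp rate (Hz/s).

    Assumption: the LCT is computed as an STFT with a chirp-demodulated
    Hann window; chirp_rate = 0 reduces to an ordinary STFT.
    """
    window = np.hanning(win_len)
    # Time relative to the frame centre, in seconds.
    u = (np.arange(win_len) - win_len / 2) / fs
    # Quadratic-phase kernel that cancels a linear frequency sweep.
    kernel = window * np.exp(-1j * np.pi * chirp_rate * u ** 2)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[i * hop:i * hop + win_len] for i in range(n_frames)])
    # Frames become complex after demodulation, so use the full FFT
    # and keep only the non-negative frequency bins.
    spec = np.fft.fft(frames * kernel, axis=1)[:, :win_len // 2 + 1]
    return np.abs(spec).T  # shape: (freq_bins, frames)

def poly_linear_chirplet(x, fs, chirp_rates, **kwargs):
    """Stack one LCT per chirp rate into a multichannel array."""
    return np.stack(
        [linear_chirplet_transform(x, fs, c, **kwargs) for c in chirp_rates]
    )  # shape: (channels, freq_bins, frames)

# Illustrative input: a synthetic linear chirp sweeping 500 Hz -> 3500 Hz.
fs = 8000
t = np.arange(int(0.3 * fs)) / fs
signal = np.cos(2 * np.pi * (500 * t + 0.5 * 10000 * t ** 2))
features = poly_linear_chirplet(signal, fs, chirp_rates=[0.0, 10000.0, -10000.0])
```

Each channel views the same signal along a different slope in the TF plane. For the synthetic chirp above, the channel whose chirp rate matches the sweep concentrates energy most sharply, which is the kind of trend information a single STFT-based feature tends to smear.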
Pages: 14