Robust speaker identification system based on wavelet transform and Gaussian mixture model

Cited by: 0
Authors
Hsieh, CT [1]
Lai, E [1]
Wang, YC [1]
Affiliation
[1] Tamkang Univ, Dept Elect Engn, Taipei 251, Taiwan
Keywords
wavelet transform; linear predictive cepstral coefficients (LPCC); MAT (Mandarin Speech Across Taiwan); Gaussian mixture model (GMM); speaker identification
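DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract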
This paper presents an effective and robust method for extracting features for speech processing. Based on the time-frequency multiresolution property of the wavelet transform, the input speech signal is decomposed into several frequency channels. To capture the characteristics of the vocal tract and vocal cords, the traditional linear predictive cepstral coefficients (LPCC) of the approximation channel and the entropy of the detail channel are calculated at each decomposition level. In addition, hard thresholding is applied at each lower resolution to remove noise interference. Experimental results show that this mechanism not only reduces the influence of noise effectively but also improves recognition performance. Finally, the proposed feature extraction algorithm is evaluated on the MAT telephone speech database for text-independent speaker identification using a Gaussian mixture model (GMM) identifier. Several popular existing methods are also evaluated for comparison. The results show that the proposed feature extraction method is more effective and robust than the other methods, and that its performance remains very satisfactory even at low SNR.
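The decomposition, hard thresholding, and detail-channel entropy steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the wavelet, the threshold rule, or the entropy definition, so the Haar wavelet, the fixed threshold `thr`, the Shannon entropy of normalized coefficient energies, and all function names below are assumptions chosen for illustration. The LPCC computation on the approximation channel and the GMM identifier are omitted.

```python
import math

def haar_step(signal):
    """One level of the Haar DWT (assumed wavelet): returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)  # a trailing odd sample is dropped
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail

def hard_threshold(coeffs, thr):
    """Zero every coefficient whose magnitude is below thr; keep the rest unchanged."""
    return [c if abs(c) >= thr else 0.0 for c in coeffs]

def shannon_entropy(coeffs):
    """Shannon entropy (in nats) of the normalized coefficient energies (assumed definition)."""
    energies = [c * c for c in coeffs]
    total = sum(energies)
    if total == 0.0:
        return 0.0
    probs = [e / total for e in energies if e > 0.0]
    return -sum(p * math.log(p) for p in probs)

def detail_entropies(frame, levels=3, thr=0.1):
    """Decompose a frame level by level.

    Returns one thresholded detail-channel entropy per level and the final
    approximation channel, on which LPCC would be computed in a full system.
    """
    approx = list(frame)
    entropies = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        entropies.append(shannon_entropy(hard_threshold(detail, thr)))
    return entropies, approx
```

In a full feature vector, the per-level entropies would be concatenated with the LPCC of each approximation channel; a call such as `detail_entropies(frame, levels=3, thr=0.1)` covers only the entropy part, and the values of `levels` and `thr` here are placeholders rather than settings taken from the paper.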
Pages: 267-282
Page count: 16