Perceptual MVDR-Based Cepstral Coefficients (PMCCs) for Speaker Recognition

Cited by: 0
Authors
Liang, Chunyan [1]
Zhang, Xiang [1]
Yang, Lin [1]
Zhang, Jianping [1]
Yan, Yonghong [1]
Affiliations
[1] Chinese Acad Sci, Inst Acoust, ThinkIt Speech Lab, Beijing, Peoples R China
Keywords
speaker recognition; MVDR; PMCC; joint factor analysis;
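DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes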
0808 ; 0809 ;
Abstract
Acoustic feature extraction from speech is a fundamental part of both automatic speech recognition and automatic speaker recognition. Mel-frequency cepstral coefficients (MFCCs) are widely used in both fields. A newer feature extraction technique, perceptual MVDR-based cepstral coefficients (PMCCs), has been shown to deliver superior performance in automatic speech recognition. Unlike MFCCs, in which a mel-scaled filterbank is applied to the short-term FFT spectrum to obtain a perceptually meaningful smoothed gross spectrum, PMCCs use the Minimum Variance Distortionless Response (MVDR) all-pole model to represent the spectral envelope of the perceptual spectrum. In this study, we extract PMCCs and model them using Gaussian Mixture Models (GMMs) for speaker recognition. Joint factor analysis (JFA) is used to compensate for speaker and channel variability. The experiments are carried out on the core conditions of the NIST 2008 speaker recognition evaluation data. The results indicate that systems based on PMCCs achieve performance comparable to those based on MFCCs. Moreover, fusing the two kinds of systems yields a significant performance improvement over the MFCC system alone.
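The MVDR spectral envelope mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it evaluates the textbook MVDR estimate 1 / (v^H R^{-1} v) directly on a windowed time-domain frame and omits the perceptual (mel) warping that PMCCs apply first. The function names `mvdr_spectrum` and `mvdr_cepstra`, the model order, the grid size, and the diagonal loading constant are all assumptions chosen for illustration.

```python
import numpy as np

def mvdr_spectrum(frame, order=12, n_freqs=257):
    """MVDR power spectrum of one frame (illustrative, no perceptual warping)."""
    x = frame * np.hamming(len(frame))
    # Biased autocorrelation estimate for lags 0..order.
    full = np.correlate(x, x, mode="full")
    mid = len(x) - 1
    r = full[mid:mid + order + 1] / len(x)
    # Toeplitz autocorrelation matrix R[i, j] = r[|i - j|].
    lags = np.arange(order + 1)
    R = r[np.abs(lags[:, None] - lags[None, :])]
    # Small diagonal loading (an assumption) keeps the inverse well conditioned.
    R = R + 1e-9 * r[0] * np.eye(order + 1)
    R_inv = np.linalg.inv(R)
    # Steering vectors v(w) = [1, e^{jw}, ..., e^{jMw}] on a grid over [0, pi].
    omega = np.linspace(0.0, np.pi, n_freqs)
    V = np.exp(1j * np.outer(lags, omega))
    # P(w) = 1 / (v^H R^{-1} v), evaluated for every grid frequency at once.
    denom = np.real(np.einsum("kf,kl,lf->f", V.conj(), R_inv, V))
    return omega, 1.0 / denom

def mvdr_cepstra(power, n_ceps=13):
    """Cepstral coefficients as the inverse FFT of the log MVDR spectrum."""
    return np.fft.irfft(np.log(power))[:n_ceps]
```

For a frame holding a sinusoid in light noise, the envelope returned by `mvdr_spectrum` peaks near the sinusoid's frequency, and `mvdr_cepstra` turns that envelope into a short feature vector of the kind a GMM would model.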
Pages: 1386-1389
Page count: 4
Related papers
50 in total
  • [1] Perceptual MVDR-based cepstral coefficients (PMCCs) for speaker recognition
    LIANG Chunyan ZHANG Xiang YANG Lin ZHANG Jianping YAN Yonghong (Key Laboratory of Speech Acoustics and Content Understanding
    [J]. Chinese Journal of Acoustics, 2012, 31 (04) : 489 - 498
  • [2] Perceptual MVDR-based cepstral coefficients (PMCCs) for robust speech recognition
    Yapanel, UH
    Dharanipragada, S
    [J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 644 - 647
  • [3] Perceptual MVDR-based Unsupervised Built-in Speaker Normalization for Kazakh Speech Recognition
    Yessenbayev, Zhandos
    Yapanel, Umit
    [J]. 2014 IEEE 8th International Conference on Application of Information and Communication Technologies (AICT), 2014, : 87 - 91
  • [4] A CEPSTRAL BASED SPEAKER RECOGNITION SYSTEM
    SETHURAMAN, R
    GOWDY, JN
    [J]. PROCEEDINGS : THE TWENTY-FIRST SOUTHEASTERN SYMPOSIUM ON SYSTEM THEORY, 1989, : 503 - 507
  • [5] Speaker Identification using Warped MVDR Cepstral Features
    Woelfel, Matthias
    Yang, Qian
    Jin, Qin
    Schultz, Tanja
    [J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 904 - +
  • [6] Cascaded Feedforward Neural Networks for speaker identification using Perceptual Wavelet based Cepstral Coefficients
    Renisha, G.
    Jayasree, T.
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2019, 37 (01) : 1141 - 1153
  • [7] Perceptual harmonic cepstral coefficients for speech recognition in noisy environment
    Gu, L
    Rose, K
    [J]. 2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001, : 125 - 128
  • [8] Perceptual Evaluation of Binaural MVDR-Based Algorithms to Preserve the Interaural Coherence of Diffuse Noise Fields
    Goessling, Nico
    Marquardt, Daniel
    Doclo, Simon
    [J]. TRENDS IN HEARING, 2020, 24
  • [9] Mel-Frequency Cepstral Coefficients as Features for Automatic Speaker Recognition
    Jokic, Ivan D.
    Jokic, Stevan D.
    Delic, Vlado D.
    Peric, Zoran H.
    [J]. 2015 23RD TELECOMMUNICATIONS FORUM TELFOR (TELFOR), 2015, : 419 - 424
  • [10] Mel Frequency Cepstral Coefficients Based Text Independent Automatic Speaker Recognition Using Matlab
    Singh, Amit Kumar
    Singh, Rohit
    Dwivedi, Ashutosh
    [J]. PROCEEDINGS OF THE 2014 INTERNATIONAL CONFERENCE ON RELIABILTY, OPTIMIZATION, & INFORMATION TECHNOLOGY (ICROIT 2014), 2014, : 524 - 527