Perceptual MVDR-Based Cepstral Coefficients (PMCCs) for Speaker Recognition

Cited by: 0

Authors:

Liang, Chunyan [1]

Zhang, Xiang [1]

Yang, Lin [1]

Zhang, Jianping [1]

Yan, Yonghong [1]

Affiliations:

[1] Chinese Acad Sci, Inst Acoust, ThinkIt Speech Lab, Beijing, Peoples R China

Source:

2010 IEEE 10TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS (ICSP2010), VOLS I-III | 2010

Keywords:

speaker recognition; MVDR; PMCC; joint factor analysis;

DOI:

Not available

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology];

Subject classification codes:

0808 ; 0809 ;

Abstract:

Acoustic feature extraction from speech is a fundamental part of both automatic speech recognition and automatic speaker recognition. Mel-frequency cepstral coefficients (MFCCs) are widely used in both fields. A newer feature extraction technique, perceptual MVDR-based cepstral coefficients (PMCCs), has been shown to deliver superior performance in automatic speech recognition. Unlike MFCCs, in which a mel-scaled filterbank is applied to the short-term FFT spectrum to obtain a perceptually meaningful smoothed gross spectrum, PMCCs use the Minimum Variance Distortionless Response (MVDR) all-pole model to represent the spectral envelope of the perceptual spectrum. In this study, we extract PMCCs and model them using Gaussian Mixture Models (GMMs) for speaker recognition. To compensate for speaker and channel variability, joint factor analysis (JFA) is used. The experiments are carried out on the core conditions of the NIST 2008 speaker recognition evaluation data. The results indicate that systems based on PMCCs achieve performance comparable to those based on MFCCs. Moreover, fusing the two kinds of systems yields a significant improvement over the MFCC system alone.
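The MVDR envelope mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it computes the textbook Capon/MVDR spectrum estimate P(w) = 1 / (v^H R^{-1} v) directly from a frame's autocorrelation matrix, and it omits the perceptual (mel) warping and the cepstral conversion that PMCCs add on top. The function name `mvdr_spectrum`, the model order of 12, and the small diagonal loading term are assumptions chosen for illustration.

```python
import numpy as np

def mvdr_spectrum(frame, order=12, n_freqs=129):
    """Capon/MVDR power spectrum estimate on [0, pi] (illustrative sketch)."""
    x = np.asarray(frame, dtype=float)
    n = len(x)
    # Biased autocorrelation estimates r[0..order]
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(order + 1)])
    # Symmetric Toeplitz autocorrelation matrix R[i, j] = r[|i - j|];
    # the tiny diagonal loading is an assumption added for numerical stability
    lags = np.abs(np.subtract.outer(np.arange(order + 1), np.arange(order + 1)))
    R = r[lags] + 1e-9 * r[0] * np.eye(order + 1)
    omegas = np.linspace(0.0, np.pi, n_freqs)
    # Steering vectors v(w) = [1, e^{jw}, ..., e^{j*order*w}], one column per frequency
    V = np.exp(1j * np.outer(np.arange(order + 1), omegas))
    # Quadratic form v^H R^{-1} v for every frequency at once
    denom = np.real(np.sum(np.conj(V) * np.linalg.solve(R, V), axis=0))
    return omegas, 1.0 / denom

# Usage: a noisy sinusoid at 0.25*pi rad/sample should give a peak near that frequency
rng = np.random.default_rng(0)
t = np.arange(400)
frame = np.cos(0.25 * np.pi * t) + 0.01 * rng.standard_normal(400)
omegas, power = mvdr_spectrum(frame)
peak_freq = omegas[np.argmax(power)]
```

Solving the linear system per frame keeps the sketch close to the definition; a practical front end would more likely derive the same envelope from linear prediction coefficients for speed.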

Pages: 1386-1389

Page count: 4


