Mixture linear prediction Gammatone Cepstral features for robust speaker verification under transmission channel noise

Cited by: 4
Authors
Krobba, Ahmed [1 ]
Debyeche, Mohamed [1 ]
Selouani, Sid-Ahmed [2 ]
Affiliations
[1] USTHB, LCPTS, Algiers, Algeria
[2] Univ Moncton, LARIHS Lab, Campus Shappaing, Moncton, NB, Canada
Keywords
Automatic speaker verification; Mixture linear prediction; Gammatone Frequency Cepstral Coefficients (GFCCs); I-vector GPLDA; Transmission channel noise; AUDITORY FILTER SHAPES; RECOGNITION; PERFORMANCE;
DOI
10.1007/s11042-020-08748-2
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
In this paper, we present a Mixture Linear Prediction based approach for extracting robust Gammatone Cepstral Coefficients (MLPGCCs). The proposed method improves the performance of Automatic Speaker Verification (ASV) using i-vector and Gaussian Probabilistic Linear Discriminant Analysis (GPLDA) modeling under transmission channel noise. The extracted MLPGCCs were evaluated on the NIST 2008 database, in which conversational speech was recorded with a single-channel microphone. The system is analyzed in the presence of different transmission channel noises, namely Additive White Gaussian Noise (AWGN) and Rayleigh fading, at various Signal-to-Noise Ratio (SNR) levels. The evaluation results show that MLPGCCs are promising features for the ASV task: speaker verification performance with the proposed MLPGCCs is significantly better than with conventional Gammatone Frequency Cepstral Coefficients (GFCCs) and Mel Frequency Cepstral Coefficients (MFCCs). For speech signals corrupted by AWGN at SNRs from -5 dB to 15 dB, we obtain a significant reduction of the Equal Error Rate (EER), ranging from 9.41% to 6.65% compared with conventional MFCCs and from 3.72% to 1.50% compared with conventional GFCCs. In addition, when the test speech signals are corrupted by a Rayleigh fading channel, we achieve an EER reduction ranging from 23.63% to 7.8% compared with conventional MFCCs and from 10.88% to 6.8% compared with conventional GFCCs. We also found that combining GFCCs and MLPGCCs gives the highest speaker verification performance. The best combination achieves EERs of around 0.43% to 0.59% and 1.92% to 3.88%.
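The two channel conditions named in the abstract can be illustrated with a minimal sketch. It is not the authors' experimental setup: the function names (`add_awgn`, `apply_rayleigh_fading`, `measured_snr_db`), the 440 Hz test tone, the 16 kHz sampling rate, and the simplification of fading to an independent per-sample Rayleigh gain with unit mean power are all assumptions made here for illustration.

```python
import numpy as np

def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise scaled so that the result has the target SNR in dB."""
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def apply_rayleigh_fading(signal, rng):
    """Scale each sample by an independent Rayleigh gain (assumed simplification)."""
    signal = np.asarray(signal, dtype=float)
    # A Rayleigh variable with scale s has mean power 2 * s^2,
    # so s = 1 / sqrt(2) keeps the average signal power unchanged.
    gains = rng.rayleigh(scale=1.0 / np.sqrt(2.0), size=signal.shape)
    return gains * signal

def measured_snr_db(clean, noisy):
    """Empirical SNR in dB of a noisy signal against its clean reference."""
    noise = np.asarray(noisy, dtype=float) - np.asarray(clean, dtype=float)
    return 10.0 * np.log10(np.mean(np.square(clean)) / np.mean(np.square(noise)))

# Hypothetical test signal: one second of a 440 Hz tone sampled at 16 kHz.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = np.sin(2.0 * np.pi * 440.0 * t)

# SNR grid spanning the -5 dB to 15 dB range mentioned in the abstract.
noisy_by_snr = {snr: add_awgn(clean, snr, rng) for snr in (-5, 0, 5, 10, 15)}
faded = apply_rayleigh_fading(clean, rng)
```

Calling `measured_snr_db(clean, noisy_by_snr[snr])` is a quick check that each corrupted signal lands close to its target SNR. A more realistic fading simulation would use time-correlated gains (for example, a Doppler-shaped process) rather than independent per-sample gains.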
Pages: 18679 - 18693
Number of pages: 15
Related papers
28 items in total
  • [1] Mixture linear prediction Gammatone Cepstral features for robust speaker verification under transmission channel noise
    Ahmed Krobba
    Mohamed Debyeche
    Sid-Ahmed Selouani
    [J]. Multimedia Tools and Applications, 2020, 79 : 18679 - 18693
  • [2] Temporally Weighted Linear Prediction Features for Speaker Verification in Additive Noise
    Saeidi, Rahim
    Pohjalainen, Jouni
    Kinnunen, Tomi
    Alku, Paavo
    [J]. ODYSSEY 2010: THE SPEAKER AND LANGUAGE RECOGNITION WORKSHOP, 2010 : 40 - 46
  • [3] Mixture Linear Prediction in Speaker Verification Under Vocal Effort Mismatch
    Pohjalainen, Jouni
    Hanilci, Cemal
    Kinnunen, Tomi
    Alku, Paavo
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2014, 21 (12) : 1516 - 1520
  • [4] Temporally Weighted Linear Prediction Features for Tackling Additive Noise in Speaker Verification
    Saeidi, Rahim
    Pohjalainen, Jouni
    Kinnunen, Tomi
    Alku, Paavo
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2010, 17 (06) : 599 - 602
  • [5] Consolidating Product Spectrum and Gammatone Filterbank for Robust Speaker Verification under noisy conditions
    Fedila, Meriem
    Bengherabi, Messaoud
    Amrouche, Abderrahmane
    [J]. 2015 15TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS (ISDA), 2015 : 347 - 352
  • [6] Robust Speaker Verification Under Additive Noise Condition
    Zhang, Er-Hua
    Wang, Ming-He
    Tang, Zhen-Min
    [J]. Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2019, 47 (06) : 1244 - 1250
  • [7] BOOSTED BINARY FEATURES FOR NOISE-ROBUST SPEAKER VERIFICATION
    Roy, Anindya
    Magimai-Doss, Mathew
    Marcel, Sebastien
    [J]. 2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010 : 4442 - 4445
  • [8] Adversarial Network Bottleneck Features for Noise Robust Speaker Verification
    Yu, Hong
    Tan, Zheng-Hua
    Ma, Zhanyu
    Guo, Jun
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017 : 1492 - 1496
  • [9] Channel/handset mismatch evaluation in a biometric speaker verification using shifted delta cepstral features
    Calvo, Jose R.
    Fernandez, Rafael
    Hernandez, Gabriel
    [J]. PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS AND APPLICATIONS, PROCEEDINGS, 2007, 4756 : 96 - 105
  • [10] Noise robust speaker verification using parallel model combination and local features
    Tüfekci, Z
    [J]. PROCEEDINGS OF THE IEEE 12TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, 2004 : 422 - 425