Enhancing Speaker Recognition Models with Noise-Resilient Feature Optimization Strategies

Cited by: 2
Authors
Chauhan, Neha [1]
Isshiki, Tsuyoshi [1]
Li, Dongju [1]
Affiliations
[1] Tokyo Inst Technol, Dept Informat & Commun Engn, Tokyo 1528550, Japan
Source
ACOUSTICS | 2024 / Vol. 6 / Issue 02
Keywords
speaker identification; speaker verification; feature-level fusion; dimension reduction; feature optimization; PCA;
DOI
10.3390/acoustics6020024
Chinese Library Classification number
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
This paper examines speaker recognition methods, focusing on three approaches: feature-level fusion, dimension reduction using principal component analysis (PCA) and independent component analysis (ICA), and feature optimization using a genetic algorithm (GA) and the marine predator algorithm (MPA). Experiments are conducted on several speech datasets with varying noise levels and speaker counts, using different classifiers. On the TIMIT babble noise dataset (120 speakers), feature fusion achieves a speaker identification accuracy of 92.7%, while various feature optimization techniques combined with K nearest neighbor (KNN) and linear discriminant (LD) classifiers yield a speaker verification equal error rate (SV EER) of 0.7%. On the TIMIT babble noise dataset (630 speakers), a KNN classifier with feature optimization achieves a speaker identification accuracy of 93.5% and an SV EER of 0.13%. On the TIMIT white noise dataset (120 and 630 speakers), PCA dimension reduction and feature optimization (PCA-MPA) with KNN classifiers attain speaker identification accuracies of 93.3% and 83.5% and SV EER values of 0.58% and 0.13%, respectively. On the VoxCeleb1 dataset, PCA-MPA feature optimization with KNN classifiers achieves a speaker identification accuracy of 95.2% and an SV EER of 1.8%. These findings indicate that feature optimization strategies improve both computational speed and speaker recognition performance.
Pages: 439 - 469
Page count: 31
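
The abstract reports speaker verification results as an equal error rate (EER), the operating point at which the false acceptance rate and the false rejection rate coincide. As a minimal sketch of how such a figure can be computed from trial scores, the following assumes higher scores mean "more likely the same speaker" and sweeps every observed score as a threshold. The function name `equal_error_rate` and the toy scores are hypothetical illustrations, not taken from the paper, whose scoring procedure is not described in this record.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from two lists of similarity scores.

    genuine:  scores for same-speaker trials (hypothetical input).
    impostor: scores for different-speaker trials (hypothetical input).
    Returns (eer, threshold), where a trial is accepted if score >= threshold.
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")

    best = None  # (gap between FAR and FRR, EER estimate, threshold)
    for t in sorted(set(genuine) | set(impostor)):
        # False acceptance rate: impostor trials that pass the threshold.
        far = sum(s >= t for s in impostor) / len(impostor)
        # False rejection rate: genuine trials that fail the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Where FAR and FRR do not meet exactly, report their midpoint.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


if __name__ == "__main__":
    genuine_scores = [0.9, 0.8, 0.7, 0.4]
    impostor_scores = [0.6, 0.3, 0.2, 0.1]
    eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
    print(f"EER = {eer:.2%} at threshold {threshold}")
```

With finite score lists the two error rates rarely cross exactly, so the sketch returns the midpoint at the threshold where they are closest; interpolating between neighboring thresholds is a common alternative.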