Enhancing Speaker Recognition Models with Noise-Resilient Feature Optimization Strategies

Cited by: 2
Authors
Chauhan, Neha [1]
Isshiki, Tsuyoshi [1]
Li, Dongju [1]
Affiliations
[1] Tokyo Inst Technol, Dept Informat & Commun Engn, Tokyo 1528550, Japan
Source
ACOUSTICS | 2024 / Volume 6 / Issue 02
Keywords
speaker identification; speaker verification; feature-level fusion; dimension reduction; feature optimization; PCA;
DOI
10.3390/acoustics6020024
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This paper examines speaker recognition methods, focusing on three approaches: feature-level fusion, dimension reduction using principal component analysis (PCA) and independent component analysis (ICA), and feature optimization through a genetic algorithm (GA) and the marine predator algorithm (MPA). Experiments are conducted across speech datasets with varying noise levels and speaker counts. On the TIMIT babble noise dataset (120 speakers), feature fusion achieves a speaker identification accuracy of 92.7%, while various feature optimization techniques combined with K-nearest neighbor (KNN) and linear discriminant (LD) classifiers result in a speaker verification equal error rate (SV EER) of 0.7%. On the TIMIT babble noise dataset (630 speakers), a KNN classifier with feature optimization achieves a speaker identification accuracy of 93.5% and an SV EER of 0.13%. On the TIMIT white noise dataset (120 and 630 speakers), speaker identification accuracies of 93.3% and 83.5%, along with SV EER values of 0.58% and 0.13%, respectively, were attained using PCA dimension reduction and feature optimization (PCA-MPA) with KNN classifiers. On the VoxCeleb1 dataset, PCA-MPA feature optimization with KNN classifiers achieves a speaker identification accuracy of 95.2% and an SV EER of 1.8%. These findings indicate that feature optimization strategies improve both computational speed and speaker recognition performance.
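The pipeline named in the abstract (feature-level fusion, PCA dimension reduction, a KNN classifier, and the equal error rate metric) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the data are synthetic, the two feature sets and their sizes (13 and 7 dimensions), the number of retained components (8), and k = 3 are arbitrary assumptions, and the GA/MPA optimization step is omitted entirely. All function names are hypothetical.

```python
import numpy as np

def synthetic_features(rng, labels, dim):
    """Hypothetical per-speaker features: one random mean per speaker plus noise."""
    means = rng.normal(0.0, 3.0, size=(labels.max() + 1, dim))
    return means[labels] + rng.normal(0.0, 1.0, size=(labels.size, dim))

def pca_fit(x, n_components):
    """Return the training mean and the top principal directions (via SVD)."""
    mean = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(x, mean, components):
    """Project centered data onto the retained principal directions."""
    return (x - mean) @ components.T

def knn_predict(train_x, train_y, test_x, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    dist = np.linalg.norm(test_x[:, None, :] - train_x[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    return np.array([np.bincount(votes).argmax() for votes in train_y[nearest]])

def equal_error_rate(genuine, impostor):
    """Approximate EER: threshold where false accept and false reject rates are closest."""
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    far = np.array([(impostor >= t).mean() for t in thresholds])
    frr = np.array([(genuine < t).mean() for t in thresholds])
    i = np.argmin(np.abs(far - frr))
    return (far[i] + frr[i]) / 2

def run_demo(seed=0):
    """Fuse two synthetic feature sets, reduce with PCA, classify with KNN."""
    rng = np.random.default_rng(seed)
    labels = np.repeat(np.arange(5), 20)          # 5 speakers, 20 samples each
    set_a = synthetic_features(rng, labels, 13)   # assumed first feature set
    set_b = synthetic_features(rng, labels, 7)    # assumed second feature set
    fused = np.hstack([set_a, set_b])             # feature-level fusion

    idx = rng.permutation(labels.size)
    train, test = idx[:70], idx[70:]

    # Standardize with training statistics only, then fit PCA on the training split.
    mu, sigma = fused[train].mean(axis=0), fused[train].std(axis=0)
    z = (fused - mu) / sigma
    mean, comps = pca_fit(z[train], n_components=8)
    train_x = pca_transform(z[train], mean, comps)
    test_x = pca_transform(z[test], mean, comps)

    pred = knn_predict(train_x, labels[train], test_x, k=3)
    return (pred == labels[test]).mean()

if __name__ == "__main__":
    print("identification accuracy:", run_demo())
    print("EER:", equal_error_rate(np.array([0.9, 0.8]), np.array([0.1, 0.2])))
```

Fusion here is plain concatenation of per-sample feature vectors; standardizing afterwards keeps one feature set from dominating the distance computation in KNN. The accuracy on this easy synthetic data says nothing about the figures reported in the abstract.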
Pages: 439 - 469
Number of pages: 31