Enhancement in speaker recognition for optimized speech features using GMM, SVM and 1-D CNN

Cited by: 10
Authors
Nainan, Sumita [1]
Kulkarni, Vaishali [1]
Affiliations
[1] SVKMs NMIMS Deemed Univ, Mumbai, Maharashtra, India
Keywords
ASR; 1-D CNN; SVM; GMM; Fisher score; ROBUST; IDENTIFICATION; VERIFICATION; FUSION; NOISE;
DOI
10.1007/s10772-020-09771-2
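Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Subject classification codes
0808; 0809;
Abstract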
Contemporary automatic speaker recognition (ASR) systems do not achieve 100% accuracy, making it imperative to explore different techniques to improve them. Easy access to mobile devices and advances in sensor technology have made voice a preferred parameter for biometrics. Here, a comparative analysis is presented of the accuracies obtained in ASR using the classical Gaussian mixture model (GMM), the support vector machine (SVM), a machine learning algorithm, and the state-of-the-art 1-D CNN as classifiers. The authors propose considering dynamic voice features along with static features, as the relevant speaker information in them leads to a substantial improvement in ASR accuracy. As concatenation of features leads to redundancy and increased computational complexity, the Fisher score algorithm was employed to select the best contributing features, resulting in improved accuracy. The results indicate that SVM and the 1-D neural network outperform GMM. SVM and the 1-D CNN gave comparable results, with the 1-D CNN giving an improved accuracy of 94.77% in ASR.
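The abstract names Fisher score feature selection but this record does not give the formulation used. As an illustration only, the sketch below uses one common definition: for each feature, the between-class scatter divided by the within-class scatter, computed across speakers. The function names `fisher_scores` and `select_top_k`, the toy feature values, and the speaker labels are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def fisher_scores(features, labels):
    """Return one Fisher score per feature column.

    Assumed definition (not confirmed by the source record):
    sum_c n_c * (mean_c - mean)^2  /  sum_c n_c * var_c
    """
    n_features = len(features[0])

    # Group the feature vectors by speaker label.
    by_class = defaultdict(list)
    for row, label in zip(features, labels):
        by_class[label].append(row)

    scores = []
    for j in range(n_features):
        column = [row[j] for row in features]
        overall_mean = sum(column) / len(column)
        between = 0.0
        within = 0.0
        for rows in by_class.values():
            values = [row[j] for row in rows]
            n_c = len(values)
            mean_c = sum(values) / n_c
            var_c = sum((v - mean_c) ** 2 for v in values) / n_c
            between += n_c * (mean_c - overall_mean) ** 2
            within += n_c * var_c
        # Small constant avoids division by zero for constant features.
        scores.append(between / (within + 1e-12))
    return scores


def select_top_k(features, labels, k):
    """Keep the k highest-scoring feature columns (original order preserved)."""
    scores = fisher_scores(features, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    reduced = [[row[j] for j in keep] for row in features]
    return keep, reduced


# Hypothetical toy data: 3 features per utterance, 2 speakers.
# Only the first feature separates the two speakers.
toy_features = [
    [1.0, 0.2, 5.0], [1.1, 0.9, 5.2], [0.9, 0.5, 4.8],
    [3.0, 0.3, 5.1], [3.1, 0.8, 4.9], [2.9, 0.6, 5.0],
]
toy_labels = ["spk_a"] * 3 + ["spk_b"] * 3

kept_columns, reduced_features = select_top_k(toy_features, toy_labels, k=1)
print(kept_columns)
```

In a setting like the one the abstract describes, the input rows would be concatenated static and dynamic feature vectors, and the reduced matrix would be passed on to the classifier. The choice of `k` is left open here because the record does not state how many features were retained.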
Pages: 809-822
Number of pages: 14
Related papers
50 in total
  • [41] A Radar HRRP Target Recognition Method Based on Conditional Wasserstein VAEGAN and 1-D CNN
    He, Jiaxing
    Wang, Xiaodan
    Xiang, Qian
    PATTERN RECOGNITION AND COMPUTER VISION, PT I, PRCV 2022, 2022, 13534 : 762 - 777
  • [42] Research on epileptic EEG recognition based on improved residual networks of 1-D CNN and indRNN
    Ma, Mengnan
    Cheng, Yinlin
    Wei, Xiaoyan
    Chen, Ziyi
    Zhou, Yi
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2021, 21 (SUPPL 2)
  • [43] Research on epileptic EEG recognition based on improved residual networks of 1-D CNN and indRNN
    Mengnan Ma
    Yinlin Cheng
    Xiaoyan Wei
    Ziyi Chen
    Yi Zhou
    BMC Medical Informatics and Decision Making, 21
  • [44] Robust Recognition of 1-D Barcodes Using Camera Phones
    Wachenfeld, Steffen
    Terlunen, Sebastian
    Jiang, Xiaoyi
    19TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1-6, 2008, : 583 - 586
  • [45] Robust Recognition of 1-D Barcodes Using Hough Transform
    Dwinell, John
    Bian, Peng
    Bian, Long Xiang
    IMAGE PROCESSING: MACHINE VISION APPLICATIONS V, 2012, 8300
  • [46] Analysis of partial iris recognition using a 1-D approach
    Du, YZ
    Bonney, B
    Ives, R
    Etter, D
    Schultz, R
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 961 - 964
  • [47] Human Emotion Recognition by Integrating Facial and Speech Features: An Implementation of Multimodal Framework using CNN
    Srinivas, P. V. V. S.
    Mishra, Pragnyaban
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (01) : 592 - 603
  • [48] Application of 1-D CNN to Predict Epileptic Seizures using EEG Records
    Khalilpour, Simin
    Ranjbar, Amin
    Menhaj, Mohammad Bagher
    Sandooghdar, Afshin
    2020 6TH INTERNATIONAL CONFERENCE ON WEB RESEARCH (ICWR), 2020, : 314 - 318
  • [49] Kurdish Dialect Recognition using 1D CNN
    Ghafoor, Karzan J.
    Rawf, Karwan M. Hama
    Abdulrahman, Ayub O.
    Taher, Sarkhel H.
    ARO-THE SCIENTIFIC JOURNAL OF KOYA UNIVERSITY, 2021, 9 (02):
  • [50] Affect-insensitive speaker recognition systems via emotional speech clustering using prosodic features
    Dongdong Li
    Yubo Yuan
    Zhaohui Wu
    Yingchun Yang
    Neural Computing and Applications, 2015, 26 : 473 - 484