Enhancement in speaker recognition for optimized speech features using GMM, SVM and 1-D CNN

Cited by: 0
Authors
Sumita Nainan
Vaishali Kulkarni
Affiliations
[1] SVKM’s NMIMS Deemed To Be University
Keywords
ASR; 1-D CNN; SVM; GMM; Fisher score
DOI
Not available
Abstract
Contemporary automatic speaker recognition (ASR) systems do not achieve 100% accuracy, making it imperative to explore techniques that improve it. Easy access to mobile devices and advances in sensor technology have made voice a preferred biometric parameter. This paper presents a comparative analysis of the accuracies obtained in ASR with three classifiers: the classical Gaussian mixture model (GMM), the support vector machine (SVM), a machine learning algorithm, and the state-of-the-art 1-D CNN. The authors propose using dynamic voice features along with static features, because the speaker-relevant information they carry leads to a substantial improvement in ASR accuracy. Because concatenating features introduces redundancy and increases computational complexity, the Fisher score algorithm was employed to select the best-contributing features, which improved accuracy. The results indicate that SVM and the 1-D CNN outperform GMM. SVM and the 1-D CNN gave comparable results, with the 1-D CNN giving an improved accuracy of 94.77% in ASR.
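The feature pipeline summarized in the abstract (dynamic features appended to static ones, then Fisher score selection) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give their delta formula or the exact form of the Fisher score, so the two-point central difference in `delta` and the common between-class over within-class variance ratio in `fisher_scores` are assumptions, and all function names here are hypothetical.

```python
from collections import defaultdict


def delta(frames):
    """Dynamic features as a two-point central difference over time.

    Assumption: edge frames are repeated for padding; the paper may use a
    different (e.g. regression-based) delta formula.
    """
    padded = [frames[0]] + list(frames) + [frames[-1]]
    return [
        [(nxt - prv) / 2.0 for prv, nxt in zip(padded[t - 1], padded[t + 1])]
        for t in range(1, len(padded) - 1)
    ]


def fisher_scores(X, y):
    """Fisher score per feature column.

    Assumed form: sum_k n_k (mu_kj - mu_j)^2 / sum_k n_k var_kj, where k runs
    over classes (speakers) and j is the feature index.
    """
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between = 0.0
        within = 0.0
        for rows in by_class.values():
            values = [row[j] for row in rows]
            mean = sum(values) / len(values)
            var = sum((v - mean) ** 2 for v in values) / len(values)
            between += len(values) * (mean - overall_mean) ** 2
            within += len(values) * var
        if within > 0:
            scores.append(between / within)
        else:
            # Zero within-class spread: perfectly separating or constant.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def select_top_k(X, y, k):
    """Keep the k features with the highest Fisher score."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[j] for j in keep] for row in X]


# Toy data: feature 0 separates the two speakers, feature 1 does not.
X = [[1.0, 5.0], [1.2, 1.0], [3.0, 5.1], [3.2, 0.9]]
y = ["a", "a", "b", "b"]
kept, X_selected = select_top_k(X, y, k=1)
```

In this toy example the first feature is retained because its class means differ widely relative to its within-class spread. The reduced feature matrix would then be passed to whichever classifier is being compared (GMM, SVM or 1-D CNN).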
Pages: 809 - 822
Page count: 13
Related papers
50 in total
  • [1] Enhancement in speaker recognition for optimized speech features using GMM, SVM and 1-D CNN
    Nainan, Sumita
    Kulkarni, Vaishali
    [J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2021, 24 (04) : 809 - 822
  • [2] A Novel S-LDA Features for Automatic Emotion Recognition from Speech using 1-D CNN
    Tiwari, Pradeep
    Darji, A. D.
    [J]. INTERNATIONAL JOURNAL OF MATHEMATICAL ENGINEERING AND MANAGEMENT SCIENCES, 2022, 7 (01) : 49 - 67
  • [3] GMM supervector based SVM with spectral features for speech emotion recognition
    Hu, Hao
    Xu, Ming-Xing
    Wu, Wei
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 413 - +
  • [4] Speech/speaker recognition using a HMM/GMM hybrid model
    Rodriguez, E
    Ruiz, B
    Garcia-Crespo, A
    Garcia, F
    [J]. AUDIO- AND VIDEO-BASED BIOMETRIC PERSON AUTHENTICATION, 1997, 1206 : 227 - 234
  • [5] Text-independent speaker recognition using probabilistic SVM with GMM adjustment
    Hou, FL
    Wang, BX
    [J]. 2003 INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, PROCEEDINGS, 2003, : 305 - 308
  • [6] 1D-CNN: Speech Emotion Recognition System Using a Stacked Network with Dilated CNN Features
    Mustaqeem
    Kwon, Soonil
    [J]. CMC-COMPUTERS MATERIALS & CONTINUA, 2021, 67 (03) : 4039 - 4059
  • [7] Source and System Features for Text Independent Speaker Recognition Using GMM Speaker Models
    Revathi, A.
    Venkataramani, Y.
    [J]. RECENT TRENDS IN NETWORKS AND COMMUNICATIONS, 2010, 90 : 21 - +
  • [8] Speaker Recognition for Hindi Speech Signal using MFCC-GMM Approach
    Maurya, Ankur
    Kumar, Divya
    Agarwal, R. K.
    [J]. 6TH INTERNATIONAL CONFERENCE ON SMART COMPUTING AND COMMUNICATIONS, 2018, 125 : 880 - 887
  • [9] Speaker Dependent, Speaker Independent and Cross Language Emotion Recognition From Speech Using GMM and HMM
    Bhaykar, Manav
    Yadav, Jainath
    Rao, K. Sreenivasa
    [J]. 2013 NATIONAL CONFERENCE ON COMMUNICATIONS (NCC), 2013,
  • [10] PSO optimized 1-D CNN-SVM architecture for real-time detection and classification applications
    Navaneeth, Bhaskar
    Suchetha, M.
    [J]. COMPUTERS IN BIOLOGY AND MEDICINE, 2019, 108 : 85 - 92