One Solution of Extension of Mel-Frequency Cepstral Coefficients Feature Vector for Automatic Speaker Recognition

Cited by: 3
Authors
Jokic, Ivan D. [1 ,2 ]
Jokic, Stevan D. [2 ,3 ]
Delic, Vlado D. [1 ]
Peric, Zoran H. [4 ]
Affiliations
[1] Univ Novi Sad, Fac Tech Sci, Trg Dositeja Obradovica 6, Novi Sad 21000, Serbia
[2] Svezdrav Resenja, Denerala Draze 44, Klenje 15357, Serbia
[3] Fac Econ & Engn Management Novi Sad, Cvecarska 2, Novi Sad 21000, Serbia
[4] Univ Nis, Fac Elect Engn, Aleksandra Medvedeva 14, Nish 18000, Serbia
Source
INFORMATION TECHNOLOGY AND CONTROL | 2020, Vol. 49, No. 2
Keywords
Speaker recognition; spectrum; mel-frequency cepstral coefficients; energy; maximum;
DOI
10.5755/j01.itc.49.2.22258
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
0812 ;
Abstract
This paper considers one extension of the mel-frequency cepstral feature vector for automatic speaker recognition. The starting feature vector consisted of 18 mel-frequency cepstral coefficients (MFCCs). It was extended with two additional features derived from the appropriate spectral maxima of the speech signal. The main idea behind this research is that the accuracy of automatic speaker recognition based only on MFCCs can be increased by adding features based on the energy maxima in the appropriate frequency ranges of the observed speech frames. In the experiments, accuracy and equal error rate (EER) are compared for feature vectors that contain only MFCCs and for feature vectors that also contain the additional features. In the case with the maximum recognition accuracy achieved (92.94%), recognition accuracy increased by around 2.43%. The EER values differ less, but the results show that adding the proposed features produced a lower decision threshold. These results indicate that tracking the proposed spectral maxima in the spectrum of the speech signal leads to a more accurate automatic speaker recognizer. Determining features that track real maxima in the speech spectrum will improve the procedure of automatic speaker recognition and make it possible to avoid complex models.
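The idea of extending an 18-coefficient MFCC vector with two spectral-maximum features can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the frequency ranges or the exact definition of the features, so the band edges, the Hamming window, the use of the logarithm of the maximum power, and the function names `band_max_features` and `extend_feature_vector` are all assumptions made for illustration. The MFCCs are assumed to be computed elsewhere.

```python
import numpy as np

def band_max_features(frame, sample_rate, bands):
    """For each (low_hz, high_hz) band, return the log of the maximum
    power-spectrum value of one speech frame inside that band.
    Window choice and log scaling are illustrative assumptions."""
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    feats = []
    for low_hz, high_hz in bands:
        mask = (freqs >= low_hz) & (freqs < high_hz)
        # Small constant avoids log(0) for silent frames.
        feats.append(np.log(power[mask].max() + 1e-12))
    return np.array(feats)

def extend_feature_vector(mfcc, frame, sample_rate,
                          bands=((0.0, 1000.0), (1000.0, 4000.0))):
    """Append one spectral-maximum feature per band to an MFCC vector.
    The default band edges are hypothetical, not taken from the paper."""
    return np.concatenate([mfcc, band_max_features(frame, sample_rate, bands)])

# Example: a 25 ms frame at 16 kHz with a strong 480 Hz tone
# and a weaker 2000 Hz tone; the MFCCs are a zero placeholder.
sample_rate = 16000
t = np.arange(400) / sample_rate
frame = np.sin(2 * np.pi * 480 * t) + 0.1 * np.sin(2 * np.pi * 2000 * t)
mfcc = np.zeros(18)
extended = extend_feature_vector(mfcc, frame, sample_rate)
print(extended.shape)
```

With two bands, the extended vector has 20 components: the 18 original MFCCs followed by one maximum-energy feature per band.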
Pages: 224-236
Number of pages: 13