A Wavelet Packet and Mel-Frequency Cepstral Coefficients-Based Feature Extraction Method for Speaker Identification

Cited by: 23
Authors
Turner, Claude [1]
Joseph, Anthony [2]
Affiliations
[1] Norfolk State Univ, Dept Comp Sci, Norfolk, VA 23504 USA
[2] Pace Univ, Dept Comp Sci, 163 William St, New York, NY 10038 USA
Source
Keywords
Cepstral Coefficients; Speaker Recognition; Wavelet Packets; RECOGNITION; TUTORIAL; MODEL
DOI
10.1016/j.procs.2015.09.177
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
One of the most widely used approaches to feature extraction in speaker recognition is the filter bank-based Mel Frequency Cepstral Coefficients (MFCC) approach. The main goal of feature extraction in this context is to extract features from raw speech that capture the unique characteristics of a particular individual. During feature extraction, the discrete Fourier transform (DFT) is typically employed to compute the spectrum of the speech waveform. Over the past few years, however, the discrete wavelet transform (DWT) has attracted considerable attention and has been favored over the DFT in a wide variety of applications. The wavelet packet transform (WPT) is an extension of the DWT that adds more flexibility to the decomposition process. This work studies the impact on performance, with respect to accuracy and efficiency, of using the WPT as a substitute for the DFT in the MFCC method. The novelty of our approach lies in its focus on the wavelet and the decomposition level as the parameters that influence performance. We compare the performance of the DFT with that of the WPT, as well as with our previous work using the DWT. It is shown that the WPT yields a significantly lower order for the Gaussian Mixture Model (GMM) used to model speech, and a marginal improvement in accuracy with respect to the DFT. The WPT mirrors the DWT in terms of GMM order and can perform as well as the DWT under certain conditions. © 2015 The Authors. Published by Elsevier B.V.
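The idea of substituting a wavelet packet decomposition for the DFT in an MFCC-style pipeline can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the Haar wavelet, the decomposition level of 4, the 256-sample frame, and the function names `haar_wavelet_packet`, `dct_ii`, and `wpt_cepstral_features` are all assumptions made for the example. The sketch also uses the raw subband energies in place of a mel-spaced filter bank, which the abstract does not specify.

```python
import numpy as np

def haar_wavelet_packet(frame, level):
    """Full Haar wavelet packet decomposition (assumed wavelet, for illustration).

    Returns 2**level subbands in natural (not frequency) order.
    The frame length must be divisible by 2**level.
    """
    nodes = [np.asarray(frame, dtype=float)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            even, odd = node[0::2], node[1::2]
            # Unlike the DWT, both the low-pass and the high-pass
            # branch are split again at every level.
            next_nodes.append((even + odd) / np.sqrt(2.0))  # approximation
            next_nodes.append((even - odd) / np.sqrt(2.0))  # detail
        nodes = next_nodes
    return nodes

def dct_ii(x, n_coeffs):
    """Unnormalized DCT-II, keeping the first n_coeffs coefficients."""
    n = len(x)
    k = np.arange(n_coeffs)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return basis @ x

def wpt_cepstral_features(frame, level=4, n_coeffs=12):
    """Cepstral-style features: log subband energies followed by a DCT."""
    subbands = haar_wavelet_packet(frame, level)
    energies = np.array([np.sum(band ** 2) for band in subbands])
    log_energies = np.log(energies + 1e-12)  # small offset avoids log(0)
    return dct_ii(log_energies, n_coeffs)

# Hypothetical usage on one synthetic 256-sample frame.
rng = np.random.default_rng(0)
frame = rng.standard_normal(256)
features = wpt_cepstral_features(frame, level=4, n_coeffs=12)
print(features.shape)
```

In a sketch like this, the choice of wavelet and the decomposition level are the two knobs that the abstract identifies as influencing performance; per-frame feature vectors of this kind would then be modeled with a GMM for each speaker.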
Pages: 416-421
Page count: 6