Voice source cepstrum coefficients for speaker identification

Cited by: 47
Authors
Gudnason, Jon [1]
Brookes, Mike [1]
Affiliations
[1] Univ London Imperial Coll Sci Technol & Med, Dept Elect & Elect Engn, London SW7 2AZ, England
Keywords
vocal systems; speech analysis; cepstral analysis; speaker recognition;
DOI
10.1109/ICASSP.2008.4518736
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
We propose a novel feature set for speaker recognition that is based on the voice source signal. The feature extraction process uses closed-phase LPC analysis to estimate the vocal tract transfer function. The LPC spectrum envelope is converted to cepstrum coefficients, which are used to derive the voice source features. Unlike approaches based on inverse filtering, our procedure is robust to LPC analysis errors and low-frequency phase distortion. We have performed text-independent closed-set speaker identification experiments on the TIMIT and the YOHO databases using a standard Gaussian mixture model technique. Compared to using mel-frequency cepstrum coefficients alone, the misclassification rate for the TIMIT database was reduced from 1.51% to 0.16% when they were combined with the proposed voice source features. For the YOHO database, the misclassification rate decreased from 13.79% to 10.07%. The new feature vector also compares favourably to other proposed voice source feature sets.
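The step of converting an LPC spectrum envelope to cepstrum coefficients can be illustrated with the standard textbook recursion for an all-pole model. This is a minimal sketch and not the authors' implementation: the function name `lpc_to_cepstrum`, the sign convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, and the `gain` parameter are assumptions made for illustration, and the closed-phase analysis and the derivation of the voice source features themselves are not shown.

```python
import math

def lpc_to_cepstrum(a, n_ceps, gain=1.0):
    """Cepstrum of the all-pole model gain / A(z).

    Assumed convention (not taken from the paper):
    A(z) = 1 + a[0] z^-1 + ... + a[p-1] z^-p, i.e. a[k-1] holds a_k.
    Returns [c_0, c_1, ..., c_n_ceps].
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)
    c[0] = math.log(gain)  # zeroth coefficient carries the log gain
    for n in range(1, n_ceps + 1):
        # Sum over k of k * c_k * a_{n-k}, restricted to 1 <= n-k <= p
        acc = 0.0
        for k in range(max(1, n - p), n):
            acc += k * c[k] * a[n - k - 1]
        a_n = a[n - 1] if n <= p else 0.0  # a_n is zero beyond the model order
        c[n] = -a_n - acc / n
    return c

# Sanity check on a single real pole at z = r, where c_n = r**n / n
r = 0.5
ceps = lpc_to_cepstrum([-r], 4)
```

For the single-pole check, the recursion reproduces the known closed form c_n = r^n / n, which is a convenient way to validate the sign convention before applying it to real LPC coefficients.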
Pages: 4821-4824
Number of pages: 4
Related papers
  • [1] Speaker Dependent Frequency Cepstrum Coefficients
    Orsag, Filip
    [J]. SECURITY TECHNOLOGY, PROCEEDINGS, 2009, 58 : 258 - 264
  • [2] Gender Identification Of A Speaker From Voice Source
    Yucesoy, Ergun
    Nabiyev, Vasif V.
    [J]. 2013 21ST SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2013,
  • [3] Text-independent speaker identification system based on the histogram of DCT-cepstrum coefficients
    Al-Rawahy, S.
    Hossen, A.
    Heute, U.
    [J]. INTERNATIONAL JOURNAL OF KNOWLEDGE-BASED AND INTELLIGENT ENGINEERING SYSTEMS, 2012, 16 (03) : 141 - 161
  • [4] Comparison of linear prediction cepstrum coefficients and Mel-Frequency Cepstrum Coefficients for language identification
    Wong, E
    Sridharan, S
    [J]. PROCEEDINGS OF 2001 INTERNATIONAL SYMPOSIUM ON INTELLIGENT MULTIMEDIA, VIDEO AND SPEECH PROCESSING, 2001, : 95 - 98
  • [5] Residual Phase Cepstrum Coefficients with Application to Cross-lingual Speaker Verification
    Wang, Jianglin
    Johnson, Michael T.
    [J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1554 - 1557
  • [6] Is voice transformation a threat to speaker identification?
    Jin, Qin
    Toth, Arthur R.
    Black, Alan W.
    Schultz, Tanja
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4845 - 4848
  • [7] Speaker Identification: Variations of a Human voice
    Kinkiri, Saritha
    Keates, Simeon
    [J]. PROCEEDINGS OF THE 2020 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING AND COMMUNICATION ENGINEERING (ICACCE-2020), 2020,
  • [8] Voice quality and forensic speaker identification
    Nolan, Francis
    [J]. GOVOR, 2007, 24 (02) : 111 - 128
  • [9] Voice source characterization using pitch synchronous discrete cosine transform for speaker identification
    Ramakrishnan, A. G.
    Abhiram, B.
    Prasanna, S. R. Mahadeva
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2015, 137 (06) : EL469 - EL475
  • [10] Fast adaptive component weighted cepstrum pole filtering for speaker identification
    Swanson, AL
    Ramachandran, RR
    Chin, SH
    [J]. 2004 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOL 5, PROCEEDINGS, 2004, : 612 - 615