Voice source cepstrum coefficients for speaker identification

Cited by: 47
Authors
Gudnason, Jon [1]
Brookes, Mike [1]
Affiliations
[1] Univ London Imperial Coll Sci Technol & Med, Dept Elect & Elect Engn, London SW7 2AZ, England
Keywords
vocal systems; speech analysis; cepstral analysis; speaker recognition;
DOI
10.1109/ICASSP.2008.4518736
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
We propose a novel feature set for speaker recognition that is based on the voice source signal. The feature extraction process uses closed-phase LPC analysis to estimate the vocal tract transfer function. The LPC spectrum envelope is converted to cepstrum coefficients, which are used to derive the voice source features. Unlike approaches based on inverse filtering, our procedure is robust to LPC analysis errors and low-frequency phase distortion. We have performed text-independent closed-set speaker identification experiments on the TIMIT and the YOHO databases using a standard Gaussian mixture model technique. Compared to using mel-frequency cepstrum coefficients alone, the misclassification rate for the TIMIT database was reduced from 1.51% to 0.16% when they were combined with the proposed voice source features. For the YOHO database, the misclassification rate decreased from 13.79% to 10.07%. The new feature vector also compares favourably to other proposed voice source feature sets.
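The abstract's step of converting an LPC spectrum envelope to cepstrum coefficients can be illustrated with the standard textbook recursion for an all-pole model. The sketch below is not taken from the paper: the function name `lpc_to_cepstrum`, the sign convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, and the `gain` parameter are assumptions made for illustration, and how the voice source features are then derived from these coefficients is not specified in the abstract.

```python
import math


def lpc_to_cepstrum(a, n_ceps, gain=1.0):
    """Cepstrum of the all-pole envelope H(z) = gain / A(z).

    Assumed convention (hypothetical, not from the paper):
        A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ... + a[p-1] z^-p
    Returns the list [c_0, c_1, ..., c_{n_ceps}].
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)
    c[0] = math.log(gain)  # zeroth coefficient carries the log gain
    for n in range(1, n_ceps + 1):
        # Convolution term: sum over k of k * c_k * a_{n-k}, with a_j = 0 for j > p
        acc = 0.0
        for k in range(1, n):
            if n - k <= p:
                acc += k * c[k] * a[n - k - 1]
        a_n = a[n - 1] if n <= p else 0.0
        c[n] = -a_n - acc / n
    return c


if __name__ == "__main__":
    # Single real pole at z = 0.5, i.e. A(z) = 1 - 0.5 z^-1
    print(lpc_to_cepstrum([-0.5], n_ceps=4))
```

For a single real pole at z = r the recursion reproduces the closed form c_n = r^n / n, which gives a quick sanity check on the sign convention before applying it to real LPC coefficients.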
Pages: 4821 - 4824
Page count: 4
Related papers
50 in total
  • [41] Online speaker de-identification using voice transformation
    Pobar, M.
    Ipsic, I.
    [J]. 2014 37TH INTERNATIONAL CONVENTION ON INFORMATION AND COMMUNICATION TECHNOLOGY, ELECTRONICS AND MICROELECTRONICS (MIPRO), 2014, : 1264 - 1267
  • [42] THE VOICE AS A REFLECTION OF SPEAKER PHYSICAL CHARACTERISTICS - EXPERIMENTS ON SPEAKER RACE, SEX, HEIGHT, AND WEIGHT IDENTIFICATION
    LASS, NJ
    [J]. FOLIA PHONIATRICA, 1980, 32 (03): : 212 - 213
  • [43] Cross-speaker Variation in Voice Source Correlates of Focus and Deaccentuation
    Yanushevskaya, Irena
    Chasaide, Ailbhe Ni
    Gobl, Christer
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1034 - 1038
  • [44] Speaker adaptive voice source modeling with applications to speech coding and processing
    Drioli, Carlo
    Calanca, Andrea
    [J]. COMPUTER SPEECH AND LANGUAGE, 2014, 28 (05): : 1195 - 1208
  • [45] Speaker Recognition Based on Weighted Mel-cepstrum
    Yang Hong-wu
    Liu Ya-li
    Huang De-zhi
    [J]. ICCIT: 2009 FOURTH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCES AND CONVERGENCE INFORMATION TECHNOLOGY, VOLS 1 AND 2, 2009, : 200 - +
  • [46] Cancelable speaker identification based on cepstral coefficients and comb filters
    Monir M.
    Kareem M.
    El-Dolil S.M.
    Saleeb A.
    El-Fishawy A.S.
    Nassar M.A.-E.
    Zein Eldin M.A.
    Abd El-Samie F.E.
    [J]. Int J Speech Technol, 2 (471-492): : 471 - 492
  • [47] Compensated Mel frequency cepstrum coefficients
    Vergin, R
    OShaughnessy, D
    Gupta, V
    [J]. 1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 323 - 326
  • [48] Phase Based Mel Frequency Cepstral Coefficients for Speaker Identification
    Srivastava, Sumit
    Chandra, Mahesh
    Sahoo, G.
    [J]. INFORMATION SYSTEMS DESIGN AND INTELLIGENT APPLICATIONS, VOL 3, INDIA 2016, 2016, 435 : 309 - 316
  • [49] Speaker Identification Using Voice-Based Cryptography for Mobile VoIP Secure Voice Communication
    Ryu, Sang-Hyeon
    Kim, Hyoung-Gook
    [J]. 2013 THIRD WORLD CONGRESS ON INFORMATION AND COMMUNICATION TECHNOLOGIES (WICT), 2013, : 94 - 97
  • [50] Noise Robust Speaker Verification with Delta Cepstrum Normalization
    Kanda, Naoyuki
    Takeda, Ryu
    Obuchi, Yasunari
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 3111 - 3115