Processing of linear prediction residual in spectral and cepstral domains for speaker information

Cited by: 5
Authors
Pati D. [1]
Prasanna S.R.M. [1]
Affiliation
[1] Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, Guwahati
Source
Int J Speech Technol, issue 3, pp. 333-350
Keywords
LP residual; M-PDSS; R-MFCC; R-MSE; Source information; Speaker recognition; Spectral and cepstral domains
DOI
10.1007/s10772-015-9273-9
Abstract
In this work, the linear prediction (LP) residual is processed in the spectral and cepstral domains to model speaker-specific excitation information. In the spectral domain, the excitation energy information is modeled by subband energies (SBE), and the excitation periodicity information is modeled by the power differences of spectrum in subband (PDSS) measure. This work refines the existing methods of extracting SBE and PDSS by exploiting the nature of the excitation spectrum. The SBE and PDSS values are computed from the mel-warped residual subband spectrum and are called residual mel subband energies (R-MSE) and mel power differences of subband spectra (M-PDSS), respectively. Speaker recognition studies performed on the NIST-99 and NIST-03 databases demonstrate that the R-MSE and M-PDSS features carry good speaker information. It is also demonstrated that the excitation energy information can be better modeled in the cepstral domain by residual mel frequency cepstral coefficients (R-MFCC). Further, the evidence provided by the M-PDSS and R-MFCC features is different, combines well, and provides improved recognition performance. Combining the evidence from M-PDSS and R-MFCC with the vocal tract information further improves the performance. Finally, a comparative study on processing the LP residual in the temporal, spectral and cepstral domains demonstrates that, at a small cost in recognition performance, processing the LP residual in the spectral and cepstral domains provides a compact and effective way of representing the excitation information compared to temporal processing. © 2015, Springer Science+Business Media New York.
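The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the LP order, frame length, number of subbands, rectangular (non-overlapping) mel-spaced bands, the common `2595 * log10(1 + f/700)` mel formula, and the synthetic test frame are all assumptions made here for illustration. PDSS is taken in its usual form of one minus the ratio of the geometric to the arithmetic mean of the subband power spectrum, and the cepstral step is a plain DCT-II of the log subband energies, which only approximates what an R-MFCC front end would do.

```python
import math
import random

def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of one frame."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LP normal equations; returns a[0..order] with a[0] = 1."""
    a, err = [1.0] + [0.0] * order, r[0]
    for i in range(1, order + 1):
        k = -(r[i] + sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a, err = new_a, err * (1.0 - k * k)
    return a

def lp_residual(x, order=10):
    """Inverse-filter the frame with A(z) to obtain the LP residual."""
    a = levinson_durbin(autocorr(x, order), order)
    return [sum(a[j] * x[n - j] for j in range(order + 1) if n - j >= 0)
            for n in range(len(x))]

def power_spectrum(x):
    """Naive DFT power spectrum for bins 0..N/2 (adequate for short frames)."""
    n, spec = len(x), []
    for k in range(n // 2 + 1):
        re = sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = sum(x[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        spec.append(re * re + im * im)
    return spec

def mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def inv_mel(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_band_edges(n_bands, n_bins, fs):
    """Bin edges of contiguous subbands equally spaced on the mel scale."""
    top = mel(fs / 2.0)
    edges = [round(inv_mel(top * i / n_bands) / (fs / 2.0) * (n_bins - 1))
             for i in range(n_bands + 1)]
    edges[-1] = n_bins  # include the Nyquist bin in the last band
    return edges

def subband_features(spec, edges, eps=1e-12):
    """Log subband energies and PDSS (1 - geometric mean / arithmetic mean)."""
    energies, pdss = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = [s + eps for s in spec[lo:max(hi, lo + 1)]]
        arith = sum(band) / len(band)
        geo = math.exp(sum(math.log(s) for s in band) / len(band))
        energies.append(math.log(sum(band)))
        pdss.append(1.0 - geo / arith)
    return energies, pdss

def dct2(v, n_out):
    """Unnormalised DCT-II, used here to decorrelate the log energies."""
    n = len(v)
    return [sum(v[i] * math.cos(math.pi * k * (i + 0.5) / n) for i in range(n))
            for k in range(n_out)]

# Hypothetical test frame: 100 Hz impulse train plus weak noise through a
# two-pole resonator, 20 ms at 8 kHz. Real speech frames would replace this.
rng = random.Random(0)
fs, n = 8000, 160
frame, y1, y2 = [], 0.0, 0.0
for t in range(n):
    exc = (1.0 if t % 80 == 0 else 0.0) + 0.01 * rng.gauss(0.0, 1.0)
    y = exc + 1.3 * y1 - 0.8 * y2
    frame.append(y)
    y1, y2 = y, y1

res = lp_residual(frame, order=10)
spec = power_spectrum(res)
edges = mel_band_edges(8, len(spec), fs)
energies, pdss = subband_features(spec, edges)  # R-MSE-like and M-PDSS-like values
cep = dct2(energies, 5)                         # R-MFCC-like coefficients
```

PDSS is close to 0 for a flat (noise-like) subband and approaches 1 when harmonic peaks dominate, which is why it serves as a periodicity cue, while the subband energies and their cepstral transform describe how the excitation energy is distributed across frequency.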
Pages: 333-350
Number of pages: 17
Related papers
50 in total
  • [11] Linear prediction residual features for automatic speaker verification anti-spoofing
    Hanilçi, Cemal
    Multimedia Tools and Applications, 2018, 77 : 16099 - 16111
  • [12] Linear Prediction Residual-Based Constant-Q Cepstral Coefficients for Replay Attack Detection
    Phapatanaburi, Khomdet
    Buayai, Prawit
    Kupimai, Mongkol
    Yodrot, Teerapon
    2020 8TH INTERNATIONAL ELECTRICAL ENGINEERING CONGRESS (IEECON), 2020,
  • [13] Processing linear prediction residual signal to counter replay attacks
    Mishra, Jagabandhu
    Singh, Madhusudan
    Pati, Debadatta
    2018 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS (SPCOM 2018), 2018, : 95 - 99
  • [14] Implicit processing of linear prediction residual for replay attack detection
    Veesa, Suresh
    Singh, Madhusudan
    International Journal of Speech Technology, 2024, 27 (03) : 781 - 791
  • [15] Speaker Verification Anti-Spoofing Using Linear Prediction Residual Phase Features
    Hanilci, Cemal
    2017 25TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2017, : 96 - 100
  • [16] Linear Frequency Residual Cepstral Coefficients for Speech Emotion Recognition
    Hora, Baveet Singh
    Uthiraa, S.
    Patil, Hemant A.
    SPEECH AND COMPUTER, SPECOM 2023, PT I, 2023, 14338 : 116 - 129
  • [17] Feature detection based on linear prediction residual for Spoofing countermeasures of speaker verification system
    Chen, Min
    Yu, Yibiao
    FIFTH INTERNATIONAL WORKSHOP ON PATTERN RECOGNITION, 2020, 11526
  • [18] Restoring the Residual Speaker Information in Total Variability Modeling for Speaker Verification
    Zhang, Ce
    Zheng, Rong
    Xu, Bo
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 132 - 135
  • [19] Residual Information in Deep Speaker Embedding Architectures
    Stan, Adriana
    MATHEMATICS, 2022, 10 (21)
  • [20] Usefulness of residual-based features in speaker verification and their combination way with linear prediction coefficients
    Hsu, Wei-Chih
    Lai, Wen-Hsing
    Hong, Wei-Ping
    ISM WORKSHOPS 2007: NINTH IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA - WORKSHOPS, PROCEEDINGS, 2007, : 246 - 251