Noise robust estimate of speech dynamics for speaker recognition

Cited by: 0
Authors
Openshaw, JP
Mason, JS
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This paper investigates the robustness of cepstral-based features to additive noise and details two methods of increasing that robustness with minimal need for a priori knowledge of the noise statistics. The first approach is a form of noise masking that adds a fixed offset to the linear spectral estimate. The second is a form of sub-band filtering, again in the linear domain, that estimates the dynamic content of the speech using Fourier transforms. This avoids the negative values normally inherent in such filtering, which present difficulties in deriving log estimates. Both methods are shown to provide useful levels of robustness to additive noise. For example, speaker identification error rates in SNR-mismatched conditions of 15 dB are reduced from 60.5% for standard mel cepstra to 13.8% and 24.1% for the two approaches, respectively.
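The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the function names, the offset value, the window length, the hop size, and the choice to take DFT magnitudes of each sub-band's time trajectory are all assumptions made for illustration. The input is assumed to be a non-negative array of linear spectral (for example, filterbank) energies with shape (frames, bands).

```python
import numpy as np


def masked_log_spectrum(linear_spec, offset):
    """Noise-masking sketch: add a fixed offset to the linear spectral
    estimate before taking the log (offset value is an assumption)."""
    return np.log(linear_spec + offset)


def subband_dynamics(linear_spec, win=8, hop=4):
    """Sub-band dynamics sketch (assumed form): for each band, take the
    DFT magnitude of its linear-domain time trajectory over a sliding
    window of frames. Magnitudes are non-negative, unlike the output of
    a difference (delta) filter applied in the linear domain."""
    n_frames, _ = linear_spec.shape
    out = []
    for start in range(0, n_frames - win + 1, hop):
        segment = linear_spec[start:start + win]        # (win, bands)
        mag = np.abs(np.fft.rfft(segment, axis=0))      # (win//2 + 1, bands)
        out.append(mag[1:])                             # drop the DC bin
    return np.array(out)                                # (windows, win//2, bands)


# Hypothetical data: random clean energies plus non-negative additive noise.
rng = np.random.default_rng(0)
clean = rng.uniform(0.1, 10.0, size=(20, 6))
noisy = clean + rng.uniform(0.0, 1.0, size=(20, 6))

# Largest clean-versus-noisy mismatch in the log domain, without and with the offset.
plain_gap = np.abs(np.log(noisy) - np.log(clean)).max()
masked_gap = np.abs(masked_log_spectrum(noisy, 1.0)
                    - masked_log_spectrum(clean, 1.0)).max()

dynamics = subband_dynamics(clean)
```

Because (E + n + c) / (E + c) is never larger than (E + n) / E for non-negative noise n and a positive offset c, the masked log spectra of clean and noisy input differ by no more than the unmasked ones, which is the intuition behind the offset. In the second function every output value is non-negative, so a log can be taken afterwards, with a small floor if exact zeros are possible.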
Pages: 925-928
Page count: 4
Related papers
50 in total
  • [1] Speaker and Noise Factorization for Robust Speech Recognition
    Wang, Yongqiang
    Gales, Mark J. F.
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (07): 2149 - 2158
  • [2] Efficient Speaker and Noise Normalization for Robust Speech Recognition
    Joshi, Vikas
    Bilgi, Raghavendra
    Umesh, S.
    Benitez, C.
    Garcia, L.
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011: 2612 - 2615
  • [3] A novel channel estimate for noise robust speech recognition
    Vanderreydt, Geoffroy
    Demuynck, Kris
    COMPUTER SPEECH AND LANGUAGE, 2024, 86
  • [4] RAPID JOINT SPEAKER AND NOISE COMPENSATION FOR ROBUST SPEECH RECOGNITION
    Chin, K. K.
    Xu, Haitian
    Gales, Mark J. F.
    Breslin, Catherine
    Knill, Kate
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011: 5500 - 5503
  • [5] EXPLOITING LONG-RANGE TEMPORAL DYNAMICS OF SPEECH FOR NOISE-ROBUST SPEAKER RECOGNITION
    Jafari, Ayeh
    Srinivasan, Ramji
    Crookes, Danny
    Ming, Ji
    19TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO-2011), 2011: 2123 - 2127
  • [6] Speaker normalized spectral subband parameters for noise robust speech recognition
    Tsuge, Satoru
    Fukada, Toshiaki
    Singer, Harald
    Paliwal, Kuldip K.
    Journal of the Acoustical Society of Japan (E) (English translation of Nippon Onkyo Gakkaishi), 1999, 20 (06): 425 - 431
  • [7] Speaker normalized spectral subband parameters for noise robust speech recognition
    Tsuge, S
    Fukada, T
    Singer, H
    ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999: 285 - 288
  • [8] An integrated study of speaker normalisation and HMM adaptation for noise robust speaker-independent speech recognition
    Hariharan, R
    Viikki, O
    SPEECH COMMUNICATION, 2002, 37 (3-4): 349 - 361
  • [9] MULTILEVEL SPEECH INTELLIGIBILITY FOR ROBUST SPEAKER RECOGNITION
    Nemala, Sridhar Krishna
    Elhilali, Mounya
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012: 4393 - 4396
  • [10] Noise Robust Voice Detector for Speaker Recognition
    Hernandez, Gabriel
    Calvo, Jose R.
    Fernandez, Rafael
    Rodes, Ivis
    Martinez, Rafael
    19TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1-6, 2008: 2605 - 2608