Noise robust estimate of speech dynamics for speaker recognition

被引:0
|
作者
Openshaw, JP
Mason, JS
机构
关键词
D O I
暂无
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
This paper investigates the robustness of cepstral based features with respect to additive noise, and details two methods of increasing the robustness with minimal need for o-priori knowledge of the noise statistics. The first approach is a form of noise masking which adds a fixed offset to the linear spectral estimate. The second is a form of sub-band filtering, again in the linear domain, which estimates the dynamic content of the speech using Fourier transforms. This avoids negative values normally inherent in such filtering and which presents difficulties in deriving log estimates. Both methods are shown to provide useful levels of robustness to additive noise, for example, speaker identification error rates in SNR mis-matched conditions of 15 dB are reduced from 60.5% for standard mel cepstra to 13.8% and 24.1% for the two approaches respectively.
引用
收藏
页码:925 / 928
页数:4
相关论文
共 50 条
  • [41] Robust Speaker Recognition with Combined Use of Acoustic and Throat Microphone Speech
    Sahidullah, Md
    Hautamaki, Rosa Gonzalez
    Thomsen, Dennis Alexander Lehmann
    Kinntinenl, Tomi
    Tang, Zheng-Hua
    Hautamaki, Ville
    Parts, Robert
    Pitkanen, Martti
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 1720 - 1724
  • [42] Feature enhancement by speaker-normalized splice for robust speech recognition
    Shinohara, Yusuke
    Masuko, Takashi
    Akamine, Masami
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4881 - 4884
  • [43] An Ensemble Speaker and Speaking Environment Modeling Approach to Robust Speech Recognition
    Tsao, Yu
    Lee, Chin-Hui
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2009, 17 (05): : 1025 - 1037
  • [44] REFERENCE EIGEN-ENVIRONMENT AND SPEAKER WEIGHTING FOR ROBUST SPEECH RECOGNITION
    Liao, Yuan-Fu
    Fang, Hung-Hsiang
    Yang, Chih-Min
    2008 6TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2008, : 77 - 80
  • [45] COMBINING SPEAKER AND NOISE FEATURE NORMALIZATION TECHNIQUES FOR AUTOMATIC SPEECH RECOGNITION
    Garcia, L.
    Benitez, C.
    Segura, J. C.
    Umesh, S.
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5496 - 5499
  • [46] A Study of Additive Noise Model for Robust Speech Recognition
    Awatade, Manisha H.
    2ND INTERNATIONAL CONFERENCE ON METHODS AND MODELS IN SCIENCE AND TECHNOLOGY (ICM2ST-11), 2011, 1414
  • [47] Extended VTS for Noise-Robust Speech Recognition
    van Dalen, Rogier C.
    Gales, Mark J. F.
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (04): : 733 - 743
  • [48] Noise robust automatic speech recognition: review and analysis
    Dua M.
    Akanksha
    Dua S.
    International Journal of Speech Technology, 2023, 26 (02) : 475 - 519
  • [49] Noise robust speech recognition with state duration constraints
    Laurila, K
    1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 871 - 874
  • [50] An overview of noise-robust automatic speech recognition
    Li, Jinyu
    Deng, Li
    Gong, Yifan
    Haeb-Umbach, Reinhold
    IEEE Transactions on Audio, Speech and Language Processing, 2014, 22 (04): : 745 - 777