Noise robust estimate of speech dynamics for speaker recognition

Cited by: 0

Authors:

Openshaw, JP

Mason, JS

Source:

ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4 | 1996

DOI:

Not available

Chinese Library Classification:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

This paper investigates the robustness of cepstral-based features with respect to additive noise, and details two methods of increasing that robustness with minimal need for a priori knowledge of the noise statistics. The first approach is a form of noise masking which adds a fixed offset to the linear spectral estimate. The second is a form of sub-band filtering, again in the linear domain, which estimates the dynamic content of the speech using Fourier transforms. This avoids the negative values normally inherent in such filtering, which present difficulties in deriving log estimates. Both methods are shown to provide useful levels of robustness to additive noise: for example, speaker identification error rates in SNR-mismatched conditions of 15 dB are reduced from 60.5% for standard mel cepstra to 13.8% and 24.1% for the two approaches respectively.
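The noise-masking idea in the abstract (a fixed offset added to the linear spectrum before the log is taken) can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no parameter values, so the offset of 5.0, the four-band toy spectra, and the helper names `log_spectrum`, `cepstrum`, and `mismatch` are all assumptions made for illustration only.

```python
import numpy as np

def log_spectrum(power, offset=0.0):
    """Log of a linear power spectrum after adding a fixed masking offset.

    `offset` is a hypothetical tuning constant; the abstract does not
    state how the offset is chosen.
    """
    power = np.asarray(power, dtype=float)
    return np.log(power + offset)

def cepstrum(log_spec, n_coeffs):
    """First n_coeffs DCT-II coefficients of a log spectrum (unnormalised)."""
    log_spec = np.asarray(log_spec, dtype=float)
    n = len(log_spec)
    k = np.arange(n_coeffs)[:, None]   # cepstral index
    m = np.arange(n)[None, :]          # band index
    basis = np.cos(np.pi * k * (m + 0.5) / n)
    return basis @ log_spec

# Toy band energies (assumed values): clean speech plus flat additive noise.
clean = np.array([100.0, 10.0, 1.0, 0.1])
noisy = clean + 1.0

def mismatch(offset):
    """Largest per-band log-spectral difference between noisy and clean."""
    diff = log_spectrum(noisy, offset) - log_spectrum(clean, offset)
    return np.abs(diff).max()

unmasked = mismatch(0.0)   # dominated by the low-energy bands
masked = mismatch(5.0)     # the offset compresses those bands

masked_cepstra = cepstrum(log_spectrum(noisy, 5.0), n_coeffs=3)
```

In this toy case the offset shrinks the log-domain gap between clean and noisy spectra, mainly in the low-energy bands where additive noise dominates. The cost is reduced spectral detail in those same bands, so the offset would have to be tuned in practice.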


Pages: 925-928

Page count: 4
