Delta-MelSpectra Features for Noise Robustness to DNN-based ASR systems

Cited by: 0

Authors:

Kumar, Kshitiz [1]

Liu, Chaojun [1]

Gong, Yifan [1]

Affiliations:

[1] Microsoft Corp, Redmond, WA 98052 USA

Source:

16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015

Keywords:

Speech recognition; denoising; delta-features; temporal-difference; DNNs; nonlinearity;

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics];

Subject classification code:

070206 ; 082403 ;

Abstract:

Deep neural networks (DNNs) have significantly improved automatic speech recognition (ASR) accuracy across a range of speech scenarios. However, noise robustness remains a challenge for DNNs: accuracy degrades significantly in noisy environments compared with clean ones. Many current DNN-based ASR engines use log-MelSpectra features together with temporal-difference features in the form of delta and delta-delta features. In this work we introduce delta-MelSpectra features to seek significant gains for DNNs in noisy environments, and we demonstrate that taking the temporal difference directly in the MelSpectra domain can provide superior noise-robust features. We validate the delta-MelSpectra features on a multistyle-trained DNN-ASR system; tested on large-scale WindowsPhone client data, they yield 17% and 12% relative reductions in word error rate (WER) for noisy and clean environments, respectively.
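
As a rough illustration of the idea in the abstract, the sketch below contrasts a temporal difference taken on log-Mel values with one taken directly on Mel spectral values. It is not the authors' implementation: the function names (`temporal_difference`, `log_mel`), the symmetric difference with a `lag` parameter, the edge clamping, and the toy numbers are all assumptions, and the abstract does not specify the difference window or any nonlinearity applied afterwards. The toy case further assumes noise that is additive and constant over time in the Mel spectral domain; under that assumption the constant cancels in the Mel-domain difference but not in the log-domain one.

```python
import math


def temporal_difference(frames, lag=1):
    """Per-band difference d[t] = x[t + lag] - x[t - lag].

    Frame indices are clamped at the sequence edges (an assumption;
    other edge treatments are equally plausible).
    """
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        ahead = frames[min(t + lag, last)]
        behind = frames[max(t - lag, 0)]
        out.append([a - b for a, b in zip(ahead, behind)])
    return out


def log_mel(frames, floor=1e-10):
    """Natural log of each Mel value, floored to avoid log(0)."""
    return [[math.log(max(v, floor)) for v in frame] for frame in frames]


# Toy Mel spectra: 5 frames x 2 bands (made-up values, not real data).
mel = [[1.0, 4.0], [2.0, 4.0], [4.0, 4.0], [8.0, 4.0], [16.0, 4.0]]

# Hypothetical stationary additive noise: the same offset in every frame and band.
noisy_mel = [[v + 3.0 for v in frame] for frame in mel]

# Conventional route: difference of log-Mel features.
delta_clean_log = temporal_difference(log_mel(mel))
delta_noisy_log = temporal_difference(log_mel(noisy_mel))

# Route named in the abstract: difference taken directly on Mel spectra.
delta_mel = temporal_difference(mel)
delta_noisy_mel = temporal_difference(noisy_mel)

# The constant offset cancels in the Mel-domain difference only.
print(delta_noisy_mel == delta_mel)
print(delta_noisy_log == delta_clean_log)
```

Real noise is rarely perfectly stationary, so this only shows why a difference in the spectral domain could be less sensitive to slowly varying additive noise than a difference in the log domain.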

Pages: 2445-2448

Page count: 4
