Missing data techniques for robust speech recognition

Authors:

Cooke, M

Morris, A

Green, P

Source:

1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS | 1997

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

In noisy listening conditions, the information available on which to base speech recognition decisions is necessarily incomplete: some spectro-temporal regions are dominated by other sources. We report on the application of a variety of techniques for missing data in speech recognition. These techniques may be based on marginal distributions or on reconstruction of missing parts of the spectrum. Application of these ideas in the Resource Management task shows performance which is robust to random removal of up to 80% of the frequency channels, but falls off rapidly with deletions which more realistically simulate masked speech. We report on a vowel classification experiment designed to isolate some of the RM problems for more detailed exploration. The results of this experiment confirm the general superiority of marginals-based schemes, demonstrate the viability of shared covariance statistics, and suggest several ways in which performance improvements on the larger task may be obtained.
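
The marginals-based idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each class is modelled by a single diagonal-covariance Gaussian over frequency channels, that a binary mask marking reliable channels is already known, and that the variance vector is shared across classes (a simplified stand-in for the shared covariance statistics the abstract refers to). The function names `marginal_log_likelihood` and `classify`, and all numeric values, are hypothetical. Under a diagonal covariance, integrating out the missing channels amounts to dropping their terms from the log-likelihood.

```python
import numpy as np


def marginal_log_likelihood(x, present, mean, var):
    """Log-likelihood of the reliable channels of x under a diagonal Gaussian.

    Channels where `present` is False are marginalised out; with a diagonal
    covariance this simply omits their terms from the sum.
    """
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    var = np.asarray(var, dtype=float)
    present = np.asarray(present, dtype=bool)
    diff = x[present] - mean[present]
    v = var[present]
    return float(-0.5 * np.sum(np.log(2.0 * np.pi * v) + diff * diff / v))


def classify(x, present, class_means, shared_var):
    """Return the index of the class with the highest marginal log-likelihood."""
    scores = [
        marginal_log_likelihood(x, present, m, shared_var) for m in class_means
    ]
    return int(np.argmax(scores))


# Illustrative (made-up) models: two classes over four channels, shared variance.
class_means = [np.zeros(4), np.full(4, 3.0)]
shared_var = np.ones(4)

# The observation comes from class 1, but channels 1 and 3 are masked by noise.
x = np.array([2.9, -40.0, 3.1, -50.0])
mask = np.array([True, False, True, False])

with_mask = classify(x, mask, class_means, shared_var)
without_mask = classify(x, np.ones(4, dtype=bool), class_means, shared_var)
print(with_mask, without_mask)
```

In this toy case the masked decision uses only channels 0 and 2 and selects class 1, whereas treating the corrupted channels as reliable pulls the decision to class 0. A reconstruction-based scheme would instead fill in the masked channels before scoring; that variant is not shown here.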

Pages: 863-866

Number of pages: 4
