Missing data techniques for robust speech recognition

Cited by: 0
Authors
Cooke, M
Morris, A
Green, P
Institution
Keywords
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In noisy listening conditions, the information available on which to base speech recognition decisions is necessarily incomplete: some spectro-temporal regions are dominated by other sources. We report on the application of a variety of techniques for missing data in speech recognition. These techniques may be based on marginal distributions or on reconstruction of missing parts of the spectrum. Application of these ideas in the Resource Management task shows performance which is robust to random removal of up to 80% of the frequency channels, but falls off rapidly with deletions which more realistically simulate masked speech. We report on a vowel classification experiment designed to isolate some of the RM problems for more detailed exploration. The results of this experiment confirm the general superiority of marginals-based schemes, demonstrate the viability of shared covariance statistics, and suggest several ways in which performance improvements on the larger task may be obtained.
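The marginals-based idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes diagonal-covariance Gaussian class models, for which integrating out the missing channels reduces to dropping their terms from the log-likelihood sum. The function names, the two toy classes, and all numeric values below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math

def marginal_log_likelihood(x, mask, mean, var):
    """Log-likelihood of the reliable channels of x under a diagonal Gaussian.

    With a diagonal covariance, marginalising over the missing channels
    simply omits their terms from the sum.
    """
    total = 0.0
    for xi, present, mu, v in zip(x, mask, mean, var):
        if present:
            total += -0.5 * (math.log(2.0 * math.pi * v) + (xi - mu) ** 2 / v)
    return total

def classify(x, mask, models):
    """Return the label whose model gives the highest marginal log-likelihood."""
    return max(models, key=lambda label: marginal_log_likelihood(x, mask, *models[label]))

# Hypothetical two-class models over four frequency channels: (means, variances).
models = {
    "a": ([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0, 1.0]),
    "b": ([4.0, 3.0, 2.0, 1.0], [1.0, 1.0, 1.0, 1.0]),
}

# Observation from class "a" whose last two channels are dominated by
# another source, so their values happen to resemble class "b".
x = [1.1, 1.9, 2.0, 1.0]
mask = [True, True, False, False]   # True = channel judged reliable

label_with_mask = classify(x, mask, models)        # uses reliable channels only
label_all_channels = classify(x, [True] * 4, models)  # treats corrupted channels as speech
```

In this toy case the masked classifier picks "a", while scoring all four channels picks "b", which is the kind of error the missing-data approach aims to avoid. How the reliability mask is obtained is a separate problem and is not addressed by this sketch.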
Pages: 863-866
Page count: 4
Related papers
50 in total
  • [31] Handling Convolutional Noise in Missing Data Automatic Speech Recognition
    Van Segbroeck, Maarten
    Van Hamme, Hugo
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 2562 - 2565
  • [32] Speech recognition with missing data using recurrent neural nets
    Parveen, S
    Green, PD
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 14, VOLS 1 AND 2, 2002, 14 : 1189 - 1195
  • [33] Study of Robust Feature Extraction Techniques for Speech Recognition System
    Sharma, Usha
    Maheshkar, Sushila
    Mishra, A. N.
    [J]. 2015 1ST INTERNATIONAL CONFERENCE ON FUTURISTIC TRENDS ON COMPUTATIONAL ANALYSIS AND KNOWLEDGE MANAGEMENT (ABLAZE), 2015, : 666 - 670
  • [34] Improved modulation spectrum normalization techniques for robust speech recognition
    Pan, Chi-an
    Wang, Chieh-cheng
    Hung, Jeih-weih
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4089 - 4092
  • [35] Robust Submodular Data Partitioning for Distributed Speech Recognition
    Qi, Jun
    Tejedor, Javier
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 2254 - 2258
  • [36] Exploiting multimodal data fusion in robust speech recognition
    Heracleous, Panikos
    Badin, Pierre
    Bailly, Gerard
    Hagita, Norihiro
    [J]. 2010 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME 2010), 2010, : 568 - 572
  • [37] Missing feature theory applied to robust speech recognition over IP network
    Endo, T
    Kuroiwa, S
    Nakamura, S
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2004, E87D (05): : 1119 - 1126
  • [38] Investigation of Data Augmentation Techniques for Disordered Speech Recognition
    Geng, Mengzhe
    Xie, Xurong
    Liu, Shansong
    Yu, Jianwei
    Hu, Shoukang
    Liu, Xunying
    Meng, Helen
    [J]. INTERSPEECH 2020, 2020, : 696 - 700
  • [39] Mask estimation based on sound localisation for missing data speech recognition
    Harding, S
    Barker, J
    Brown, GJ
    [J]. 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 537 - 540
  • [40] On noise masking for automatic missing data speech recognition: A survey and discussion
    Cerisara, Christophe
    Demange, Sebastien
    Haton, Jean-Paul
    [J]. COMPUTER SPEECH AND LANGUAGE, 2007, 21 (03): : 443 - 457