Cepstral compensation by polynomial approximation for environment-independent speech recognition

Authors
Raj, B
Gouvea, EB
Moreno, PJ
Stern, RM
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206; 082403
Abstract
Speech recognition systems perform poorly on speech degraded by even simple effects such as linear filtering and additive noise. One possible solution to this problem is to modify the probability density function (PDF) of clean speech to account for the effects of the degradation. However, even for the case of linear filtering and additive noise, it is extremely difficult to do this analytically. Previously attempted analytical solutions to the problem of noisy speech recognition have either used an overly simplified mathematical description of the effects of noise on the statistics of speech, or relied on the availability of large environment-specific adaptation sets. Some of the previous methods required adaptation data consisting of simultaneously recorded ("stereo") recordings of clean and degraded speech. In this paper we introduce an approximation-based method to compute the effects of the environment on the parameters of the PDF of clean speech. We compensate for the effects of linear filtering and additive noise on the clean speech by Vector Polynomial approximationS (VPS). We also estimate the parameters of the environment, namely the noise and the channel, by using piecewise-linear approximations of these effects. We evaluate the performance of VPS using the CMU SPHINX-II system and the 100-word alphanumeric CENSUS database. Performance is evaluated at several SNRs, with artificial white Gaussian noise added to the database. VPS provides improvements of up to 15 percent in relative recognition accuracy.
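The abstract does not give the equations, so the following is only a minimal single-bin sketch of the general idea, not the authors' implementation. It assumes the log-spectral environment model commonly used in this line of work, y = x + h + log(1 + exp(n - x - h)), where x is clean speech, h the channel, and n the noise. It fits a low-order polynomial to the nonlinear term and uses Gaussian moments to approximate the mean of the degraded speech. All function names, the fitting range of three standard deviations, and the default degree are illustrative assumptions.

```python
import math

def softplus(u):
    """Numerically stable log(1 + exp(u))."""
    return max(u, 0.0) + math.log1p(math.exp(-abs(u)))

def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_polynomial(func, half_width, degree, num_points=201):
    """Least-squares fit of func(v) on [-half_width, half_width].

    Returns coefficients ordered from the constant term upward.
    """
    step = 2.0 * half_width / (num_points - 1)
    vs = [-half_width + step * i for i in range(num_points)]
    size = degree + 1
    # Normal equations of the least-squares problem.
    gram = [[sum(v ** (i + j) for v in vs) for j in range(size)]
            for i in range(size)]
    rhs = [sum(func(v) * v ** i for v in vs) for i in range(size)]
    return solve_linear(gram, rhs)

def centered_gaussian_moments(std, max_order):
    """E[v^k] for k = 0..max_order, with v ~ N(0, std^2)."""
    moments = [1.0, 0.0]
    for k in range(2, max_order + 1):
        moments.append((k - 1) * std ** 2 * moments[k - 2])
    return moments[:max_order + 1]

def noisy_log_spectral_mean(mean_x, std_x, noise, channel, degree=2):
    """Approximate E[y] for y = x + h + log(1 + exp(n - x - h)).

    Assumes x ~ N(mean_x, std_x^2) with std_x > 0, and fixed noise n
    and channel h (a simplification made for this sketch).
    """
    mean_u = noise - mean_x - channel  # mean of u = n - x - h
    # Fit the nonlinearity around mean_u, in the centered variable v.
    coeffs = fit_polynomial(lambda v: softplus(mean_u + v),
                            3.0 * std_x, degree)
    moments = centered_gaussian_moments(std_x, degree)
    correction = sum(c * m for c, m in zip(coeffs, moments))
    return mean_x + channel + correction

if __name__ == "__main__":
    # Clean-speech bin with mean 0 and unit variance, noise at the same level.
    print(noisy_log_spectral_mean(0.0, 1.0, noise=0.0, channel=0.0))
```

Under this model the correction vanishes when the noise is far below the speech level and approaches n - x - h when the noise dominates, so the sketch returns roughly mean_x + channel in the first case and roughly the noise level in the second.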
Pages: 2340-2343
Page count: 4