Noise-robust speech recognition using a new spectral estimation method "PHASOR"

Authors
Aikawa, K [1]
Ishizuka, K [1]
Affiliations
[1] NTT Corp, Commun Sci Labs, Atsugi, Kanagawa 2430198, Japan
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper proposes a new noise-robust spectral estimation method for speech recognition. The new method, called PHASOR, is characterized by inside-frame processing. The speech spectrum is estimated from a single impulse response obtained by summing multiple pitch periods within a frame after synchronizing their phases. Because of this inside-frame processing, PHASOR improves spectral estimation accuracy and suppresses additive noise. These improvements are more pronounced when the pitch fluctuates or changes within the frame. Speaker-dependent and speaker-independent phoneme recognition experiments demonstrate that PHASOR greatly reduces the recognition error rate for speech data contaminated by noise. It also outperforms two conventional noise reduction methods, cepstral mean normalization and spectral subtraction.
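The idea of summing phase-aligned pitch periods within one frame can be illustrated with a minimal sketch. This is not the authors' PHASOR implementation: it assumes the pitch marks are already known, that every period has the same length, and that the noise is additive and uncorrelated across periods. The function names, the synthetic signal, and all parameter values are hypothetical. The abstract does not say how varying pitch is handled, so that case is left out.

```python
import numpy as np


def pitch_synchronous_average(frame, pitch_marks, period_len):
    """Average equal-length segments that each start at a pitch mark.

    Assumption: pitch_marks are known sample indices and the period is constant.
    """
    segments = [frame[m:m + period_len] for m in pitch_marks
                if m + period_len <= len(frame)]
    if not segments:
        raise ValueError("frame contains no complete pitch period")
    # Aligned speech adds coherently; uncorrelated noise partly cancels.
    return np.mean(segments, axis=0)


def magnitude_spectrum(response, n_fft=256):
    """Magnitude spectrum of the averaged response (zero-padded to n_fft)."""
    return np.abs(np.fft.rfft(response, n=n_fft))


# Synthetic demo: a damped cosine repeated five times plus white noise.
rng = np.random.default_rng(0)
period_len = 80
n_periods = 5
t = np.arange(period_len)
clean_period = np.exp(-t / 10.0) * np.cos(2 * np.pi * t / 16.0)
clean_frame = np.tile(clean_period, n_periods)
noisy_frame = clean_frame + 0.3 * rng.standard_normal(clean_frame.size)
pitch_marks = [k * period_len for k in range(n_periods)]

averaged = pitch_synchronous_average(noisy_frame, pitch_marks, period_len)
single = noisy_frame[:period_len]

# Distance to the clean period, with and without averaging.
err_avg = np.linalg.norm(averaged - clean_period)
err_single = np.linalg.norm(single - clean_period)
spectrum = magnitude_spectrum(averaged)
```

Under these assumptions, averaging N aligned periods leaves the periodic component unchanged and reduces the noise standard deviation by roughly a factor of sqrt(N), so `err_avg` should come out smaller than `err_single`.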
Pages: 397-400
Number of pages: 4