AMPLITUDE MODULATION SPECTROGRAM BASED FEATURES FOR ROBUST SPEECH RECOGNITION IN NOISY AND REVERBERANT ENVIRONMENTS

Cited by: 0
Authors
Moritz, Niko [1]
Anemueller, Joern [2]
Kollmeier, Birger [1]
Affiliations
[1] Fraunhofer IDMT Project Grp Hearing Speech & Audi, Oldenburg, Germany
[2] Carl von Ossietzky Univ Oldenburg, Med Phys, Dept Phys, Oldenburg, Germany
Keywords
Amplitude Modulation Spectrogram (AMS); Feature Extraction; Reverberation; Phase; Automatic Speech Recognition (ASR); RECEPTION; SPECTRUM;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
In this contribution we present a feature extraction method that relies on modulation-spectral analysis of amplitude fluctuations within sub-bands of the acoustic spectrum, computed with a short-time Fourier transform (STFT). The experimental results indicate that the optimal temporal extent of the filter for amplitude modulation analysis is around 310 ms. It is also demonstrated that the phase information of the modulation spectrum contains important cues for speech recognition. In this context, the advantage of an odd analysis basis function is considered. The best features presented reached a total relative improvement of 53.5% for clean-condition training on Aurora-2. Furthermore, it is shown that modulation features are more robust against room reverberation than conventional cepstral and dynamic features, and that they strongly benefit from a high early-to-late energy ratio of the characteristic room impulse response (RIR).
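The two-stage analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sampling rate, frame length, hop size, use of raw STFT bins as sub-bands, Hann windows, mean removal, and all function names are assumptions chosen for illustration. Only the roughly 310 ms modulation window and the idea of keeping the complex (phase-bearing) modulation spectrum come from the abstract. A first STFT yields a magnitude trajectory per sub-band, and a second DFT over a 310 ms window of one trajectory yields its modulation spectrum.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def dft(frame):
    """Naive DFT; returns the non-negative-frequency bins of a real frame."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]


def sub_band_envelopes(signal, frame_len, hop):
    """First stage: STFT magnitude over time, returned as envelopes[band][frame]."""
    win = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + i] * win[i] for i in range(frame_len)]
        frames.append([abs(c) for c in dft(seg)])
    return [list(band) for band in zip(*frames)]


def modulation_spectrum(envelope, start, mod_len):
    """Second stage: complex DFT of one mean-removed, windowed envelope segment.

    The real part corresponds to even (cosine) basis functions and the
    imaginary part to odd (sine) basis functions, so phase is retained.
    """
    seg = envelope[start:start + mod_len]
    mean = sum(seg) / len(seg)
    win = hann(len(seg))
    return dft([(v - mean) * w for v, w in zip(seg, win)])


# Illustrative parameters (assumptions, not taken from the paper).
fs = 8000                  # sampling rate in Hz
frame_len = hop = 80       # 10 ms frames, so envelopes are sampled at 100 Hz
frame_rate = fs / hop
mod_len = 31               # 31 frames = 310 ms modulation analysis window
carrier_hz = 1000.0
mod_hz = 2 * frame_rate / mod_len  # about 6.45 Hz, centred on modulation bin 2

# Synthetic test signal: a 1 kHz tone with slow amplitude modulation.
signal = [
    (1.0 + 0.8 * math.cos(2 * math.pi * mod_hz * n / fs))
    * math.sin(2 * math.pi * carrier_hz * n / fs)
    for n in range(fs // 2)
]

envelopes = sub_band_envelopes(signal, frame_len, hop)
carrier_band = round(carrier_hz * frame_len / fs)
spec = modulation_spectrum(envelopes[carrier_band], 0, mod_len)

# Strongest modulation component, ignoring the DC bin.
peak_bin = max(range(1, len(spec)), key=lambda k: abs(spec[k]))
peak_hz = peak_bin * frame_rate / mod_len
print("peak modulation frequency (Hz):", peak_hz)
print("phase at peak (rad):", cmath.phase(spec[peak_bin]))
```

The naive DFT keeps the sketch dependency-free; a practical system would use an FFT library and perceptually motivated sub-bands, neither of which is specified in this record.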
Pages: 5492-5495
Number of pages: 4
Related papers
50 in total
  • [1] SPATIAL DIFFUSENESS FEATURES FOR DNN-BASED SPEECH RECOGNITION IN NOISY AND REVERBERANT ENVIRONMENTS
    Schwarz, Andreas
    Huemmer, Christian
    Maas, Roland
    Kellermann, Walter
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4380 - 4384
  • [2] Speech Emotion Recognition in Noisy and Reverberant Environments
    Heracleous, Panikos
    Yasuda, Keiji
    Sugaya, Fumiaki
    Yoneyama, Akio
    Hashimoto, Masayuki
    [J]. 2017 SEVENTH INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2017, : 262 - 266
  • [3] ROBUST SPEECH RECOGNITION IN UNKNOWN REVERBERANT AND NOISY CONDITIONS
    Hsiao, Roger
    Ma, Jeff
    Hartmann, William
    Karafiat, Martin
    Grezl, Frantisek
    Burget, Lukas
    Szoke, Igor
    Cernocky, Jan Honza
    Watanabe, Shinji
    Chen, Zhuo
    Mallidi, Sri Harish
    Hermansky, Hynek
    Tsakalidis, Stavros
    Schwartz, Richard
    [J]. 2015 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), 2015, : 533 - 538
  • [4] Techniques for robust speech recognition in noisy and reverberant conditions
    Brown, GJ
    Palomäki, KJ
    [J]. SPEECH SEPARATION BY HUMANS AND MACHINES, 2005, : 213 - 220
  • [5] Robust speech recognition using the modulation spectrogram
    Kingsbury, BED
    Morgan, N
    Greenberg, S
    [J]. SPEECH COMMUNICATION, 1998, 25 (1-3) : 117 - 132
  • [6] EXEMPLAR-BASED NOISE ROBUST AUTOMATIC SPEECH RECOGNITION USING MODULATION SPECTROGRAM FEATURES
    Baby, Deepak
    Virtanen, Tuomas
    Gemmeke, Jort F.
    Barker, Tom
    Van Hamme, Hugo
    [J]. 2014 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY SLT 2014, 2014, : 519 - 524
  • [7] ROBUST RECOGNITION OF REVERBERANT AND NOISY SPEECH USING COHERENCE-BASED PROCESSING
    Menon, Anjali
    Kim, Chanwoo
    Stern, Richard M.
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6775 - 6779
  • [8] RESTORATION OF INSTANTANEOUS AMPLITUDE AND PHASE OF SPEECH SIGNAL IN NOISY REVERBERANT ENVIRONMENTS
    Liu, Yang
    Nower, Naushin
    Yan, Yonghong
    Unoki, Masashi
    [J]. 2015 23RD EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2015, : 879 - 883
  • [9] Speech enhancement of instantaneous amplitude and phase for applications in noisy reverberant environments
    Liu, Yang
    Nower, Naushin
    Morita, Shota
    Unoki, Masashi
    [J]. SPEECH COMMUNICATION, 2016, 84 : 1 - 14
  • [10] Robust automatic speech recognition based on neural network in reverberant environments
    Bai, L.
    Li, H. L.
    He, Y. Y.
    [J]. CIVIL, ARCHITECTURE AND ENVIRONMENTAL ENGINEERING, VOLS 1 AND 2, 2017, : 1319 - 1324