A Closer Look on Hierarchical Spectro-Temporal Features (HIST)

Citations: 0
Authors
Heckmann, Martin [1]
Domont, Xavier [1]
Joublin, Frank [1]
Goerick, Christian [1]
Affiliation
[1] Honda Res Inst Europe GmbH, D-63073 Offenbach, Germany
Keywords
Spectro-temporal; auditory; robust speech recognition; non-linear smoothing
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Speech recognition robust against interfering noise remains a difficult task. We previously presented a set of spectro-temporal speech features which we termed Hierarchical Spectro-Temporal (HIST) features showing improved robustness, especially when combined with RASTA-PLP. They are inspired by the receptive fields found in the mammalian auditory cortex and are organized in two hierarchical levels. A set of filters learned via ICA captures local variations and constitutes the first layer of the hierarchy. In the second layer these local variations are combined to form larger receptive fields learned via Non Negative Sparse Coding. In this paper we introduce a non-linear smoothing along the time axis of the spectrograms at the input to the hierarchy and, additionally, a more thorough performance analysis on an isolated and a continuous digit recognition task. The results show that the combination of HIST and RASTA-PLP features yields improved recognition scores in noise.
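As a rough illustration of the preprocessing step the abstract mentions, the sketch below applies a non-linear smoothing along the time axis of a spectrogram. The abstract does not say which smoothing operator the authors use, so a sliding median filter is swapped in here purely as a stand-in; the function name `median_smooth_time`, the `width` parameter, and the `(n_channels, n_frames)` layout are assumptions made for this example, not details from the paper.

```python
import numpy as np


def median_smooth_time(spectrogram, width=5):
    """Median-filter each frequency channel along the time axis.

    spectrogram: array of shape (n_channels, n_frames) (assumed layout).
    width: odd window length in frames (hypothetical default).
    """
    if width % 2 != 1:
        raise ValueError("width must be odd")
    half = width // 2
    # Repeat the edge frames so the output keeps the input length.
    padded = np.pad(spectrogram, ((0, 0), (half, half)), mode="edge")
    # Windows of `width` consecutive frames:
    # shape (n_channels, n_frames, width).
    windows = np.lib.stride_tricks.sliding_window_view(padded, width, axis=1)
    # The median is the non-linear step: it is not a weighted sum of frames.
    return np.median(windows, axis=-1)
```

Unlike a moving average, a median filter discards an isolated single-frame spike entirely while leaving a step-like onset in place, which is one reason it is a common choice when a non-linear smoother is wanted. Whether it resembles the operator used for the HIST features cannot be determined from the abstract alone.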
Pages: 894-897
Number of pages: 4