A Closer Look on Hierarchical Spectro-Temporal Features (HIST)

Cited by: 0

Authors:

Heckmann, Martin [1]

Domont, Xavier [1]

Joublin, Frank [1]

Goerick, Christian [1]

Affiliations:

[1] Honda Research Institute Europe GmbH, D-63073 Offenbach, Germany

Source:

INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5 | 2008

Keywords:

Spectro-temporal; auditory; robust speech recognition; non-linear smoothing;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Speech recognition that is robust against interfering noise remains a difficult task. We previously presented a set of spectro-temporal speech features, termed Hierarchical Spectro-Temporal (HIST) features, which showed improved robustness, especially when combined with RASTA-PLP. They are inspired by the receptive fields found in the mammalian auditory cortex and are organized in two hierarchical levels. A set of filters learned via ICA captures local variations and constitutes the first layer of the hierarchy. In the second layer, these local variations are combined to form larger receptive fields learned via Non-Negative Sparse Coding. In this paper we introduce a non-linear smoothing along the time axis of the spectrograms at the input to the hierarchy and, additionally, present a more thorough performance analysis on an isolated and a continuous digit recognition task. The results show that the combination of HIST and RASTA-PLP features yields improved recognition scores in noise.

Pages: 894-897

Number of pages: 4

50 records in total

[1] Hierarchical spectro-temporal features for robust speech recognition
Domont, Xavier
Heckmann, Martin
Joublin, Frank
Goerick, Christian
2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4417 - 4420
[2] A hierarchical framework for spectro-temporal feature extraction
Heckmann, Martin
Domont, Xavier
Joublin, Frank
Goerick, Christian
SPEECH COMMUNICATION, 2011, 53 (05) : 736 - 752
[3] Development of spectro-temporal features of speech in children
Gautam S.
Singh L.
Gautam, Sumanlata (suman.gautam82@gmail.com), 1600, Springer Science and Business Media, LLC (20): : 543 - 551
[4] SPECTRO-TEMPORAL GABOR FEATURES FOR SPEAKER RECOGNITION
Lei, Howard
Meyer, Bernd T.
Mirghafori, Nikki
2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4241 - 4244
[5] Spectro-Temporal Features for Howling Frequency Detection
Lee, Jae-Won
Choi, Seung Ho
COMPUTER APPLICATIONS FOR WEB, HUMAN COMPUTER INTERACTION, SIGNAL AND IMAGE PROCESSING AND PATTERN RECOGNITION, 2012, 342 : 25 - +
[6] Nonnegative features of spectro-temporal sounds for classification
Cho, YC
Choi, SJ
PATTERN RECOGNITION LETTERS, 2005, 26 (09) : 1327 - 1336
[7] Spectro-temporal features for environmental sound classification
Thwe, Khine Zar
Thaw, Mie Mie
INTERNATIONAL JOURNAL OF COMPUTATIONAL SCIENCE AND ENGINEERING, 2019, 20 (02) : 179 - 189
[8] Comparing Different Flavors of Spectro-Temporal Features for ASR
Meyer, Bernd T.
Ravuri, Suman V.
Schaedler, Marc Rene
Morgan, Nelson
12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 1276 - +
[9] Robust emotion recognition by spectro-temporal modulation statistic features
Tai-Shih Chi
Lan-Ying Yeh
Chin-Cheng Hsu
Journal of Ambient Intelligence and Humanized Computing, 2012, 3 : 47 - 60
[10] Spectro-temporal Power Spectrum Features for Noise Robust ASR
Hamed Riazati Seresht
Seyed Mohammad Ahadi
Sanaz Seyedin
Circuits, Systems, and Signal Processing, 2017, 36 : 3222 - 3242
