Robust emotion recognition by spectro-temporal modulation statistic features

Cited by: 0

Authors:

Tai-Shih Chi

Lan-Ying Yeh

Chin-Cheng Hsu

Affiliation:

[1] National Chiao Tung University, Department of Electrical Engineering

Source:

Journal of Ambient Intelligence and Humanized Computing | 2012 / Vol. 3

Keywords:

Robust emotion recognition; Spectro-temporal modulation;

DOI:

Not available

Abstract:

Most speech emotion recognition studies consider clean speech. In this study, statistics of joint spectro-temporal modulation features are extracted from an auditory perceptual model and used to detect the emotional state of speech under noisy conditions. Speech samples were taken from the Berlin Emotional Speech database and corrupted with white and babble noise at various SNR levels. The study investigates a clean-train/noisy-test scenario to simulate practical conditions with unknown noise sources. Simulations demonstrate redundancy in the proposed spectro-temporal modulation features, and dimensionality reduction is further considered. The proposed modulation features achieve higher recognition rates for speech emotions under noisy conditions than (1) conventional mel-frequency cepstral coefficients combined with prosodic features and (2) the official acoustic features adopted in the INTERSPEECH 2009 Emotion Challenge. Adding the modulation features increased the recognition rates of the INTERSPEECH features by approximately 7% for all tested SNR conditions (20–0 dB).
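
The noise-corruption step in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes Gaussian white noise, a synthetic tone standing in for a speech utterance, and intermediate SNR levels of 15, 10, and 5 dB (the abstract gives only the 20–0 dB range). The function names `signal_power`, `add_white_noise`, and `measured_snr_db` are hypothetical.

```python
import math
import random


def signal_power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def add_white_noise(clean, snr_db, seed=0):
    """Return clean + Gaussian white noise scaled to the requested SNR (dB)."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    # Choose the gain so that P_clean / (gain^2 * P_noise) = 10^(snr_db / 10).
    target_noise_power = signal_power(clean) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / signal_power(noise))
    return [c + gain * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB recovered from the clean signal and its noisy version."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10.0 * math.log10(signal_power(clean) / signal_power(residual))


# Stand-in for a speech utterance (assumption): 200 Hz tone, 16 kHz, 0.5 s.
clean = [math.sin(2 * math.pi * 200 * t / 16000) for t in range(8000)]

# Clean-train / noisy-test: only the test copies are corrupted.
# The intermediate SNR levels are assumed, not taken from the abstract.
snr_levels_db = (20, 15, 10, 5, 0)
noisy_test_sets = {snr: add_white_noise(clean, snr) for snr in snr_levels_db}
```

In a clean-train/noisy-test setup of this kind, a classifier would be fitted on features from `clean` only and evaluated on features from each entry of `noisy_test_sets`. Babble noise would replace the Gaussian samples with a recorded noise signal scaled by the same gain formula.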

Pages: 47-60

Page count: 13

50 items in total

[31] Hilbert Envelope Based Spectro-Temporal Features for Phoneme Recognition in Telephone Speech
Thomas, Samuel
Ganapathy, Sriram
Hermansky, Hynek
INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 1521 - +
[32] Spectro-Temporal Energy Ratio Features for Single-Corpus and Cross-Corpus Experiments in Speech Emotion Recognition
Parlak, Cevahir
Diri, Banu
Altun, Yusuf
ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2024, 49 (03) : 3209 - 3223
[34] Spectro-Temporal Features for Howling Frequency Detection
Lee, Jae-Won
Choi, Seung Ho
COMPUTER APPLICATIONS FOR WEB, HUMAN COMPUTER INTERACTION, SIGNAL AND IMAGE PROCESSING AND PATTERN RECOGNITION, 2012, 342 : 25 - +
[35] Nonnegative features of spectro-temporal sounds for classification
Cho, YC
Choi, SJ
PATTERN RECOGNITION LETTERS, 2005, 26 (09) : 1327 - 1336
[36] Spectro-temporal features for environmental sound classification
Thwe, Khine Zar
Thaw, Mie Mie
INTERNATIONAL JOURNAL OF COMPUTATIONAL SCIENCE AND ENGINEERING, 2019, 20 (02) : 179 - 189
[37] Data-Driven and Feedback Based Spectro-Temporal Features for Speech Recognition
Sivaram, G. S. V. S.
Nemala, Sridhar Krishna
Mesgarani, Nima
Hermansky, Hynek
IEEE SIGNAL PROCESSING LETTERS, 2010, 17 (11) : 957 - 960
[38] Improved Phoneme Recognition by Integrating Evidence from Spectro-temporal and Cepstral Features
Li, Shang-wen
Sun, Liang-che
Lee, Lin-shan
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 1177 - 1180
[39] Characteristics of spectro-temporal modulation frequency selectivity in humans
Oetjen, Arne
Verhey, Jesko L.
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2017, 141 (03) : 1887 - 1895
[40] Biomimetic spectro-temporal features for music instrument recognition in isolated notes and solo phrases
Patil, Kailash
Elhilali, Mounya
EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2015, : 1 - 13