Robust emotion recognition by spectro-temporal modulation statistic features

Cited by: 0

Authors:

Tai-Shih Chi

Lan-Ying Yeh

Chin-Cheng Hsu

Affiliations:

[1] National Chiao Tung University, Department of Electrical Engineering

Source:

Journal of Ambient Intelligence and Humanized Computing | 2012, Vol. 3

Keywords:

Robust emotion recognition; Spectro-temporal modulation

DOI:

Not available

Abstract:

Most speech emotion recognition studies consider only clean speech. In this study, statistics of joint spectro-temporal modulation features are extracted from an auditory perceptual model and used to detect the emotional state of speech under noisy conditions. Speech samples were taken from the Berlin Emotional Speech database and corrupted with white and babble noise at various SNR levels. The study investigates a clean-train/noisy-test scenario to simulate practical conditions with unknown noise sources. Simulations demonstrate the redundancy of the proposed spectro-temporal modulation features and further examine dimensionality reduction. The proposed modulation features achieve higher emotion recognition rates under noisy conditions than (1) conventional mel-frequency cepstral coefficients combined with prosodic features and (2) the official acoustic features adopted in the INTERSPEECH 2009 Emotion Challenge. Adding modulation features increased the recognition rates of the INTERSPEECH features by approximately 7% for all tested SNR conditions (20–0 dB).
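
The noise-corruption step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `add_white_noise` and `measured_snr_db`, the 440 Hz tone standing in for a speech sample, the 16 kHz sampling rate, and the 10 dB step between SNR levels are all assumptions made for illustration. The abstract only states that white and babble noise were added at SNRs from 20 to 0 dB.

```python
import math
import random


def add_white_noise(signal, snr_db, seed=0):
    """Return `signal` plus white Gaussian noise scaled to `snr_db` (hypothetical helper)."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = sum(n * n for n in noise) / len(noise)
    # Choose the gain so that 10*log10(signal_power / scaled_noise_power) == snr_db.
    gain = math.sqrt(signal_power / (noise_power * 10 ** (snr_db / 10)))
    return [x + gain * n for x, n in zip(signal, noise)]


def measured_snr_db(clean, noisy):
    """Compute the SNR in dB from a clean signal and its noisy version."""
    residual = [y - x for x, y in zip(clean, noisy)]
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum(r * r for r in residual) / len(residual)
    return 10 * math.log10(signal_power / noise_power)


# Assumed stand-in for a speech sample: one second of a 440 Hz tone at 16 kHz.
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]

# Illustrative SNR levels within the 20-0 dB range named in the abstract.
for snr_db in (20, 10, 0):
    noisy = add_white_noise(clean, snr_db)
    print(snr_db, round(measured_snr_db(clean, noisy), 2))
```

In a clean-train/noisy-test setup, a routine like this would be applied to the test utterances only, while the classifier is trained on the uncorrupted samples. Babble noise would need a recorded noise signal in place of the Gaussian samples, with the same power-based scaling.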

Pages: 47-60

Page count: 13
