Robust emotion recognition by spectro-temporal modulation statistic features

Cited by: 0
Authors
Tai-Shih Chi
Lan-Ying Yeh
Chin-Cheng Hsu
Affiliation
[1] National Chiao Tung University, Department of Electrical Engineering
Keywords
Robust emotion recognition; Spectro-temporal modulation
DOI
Not available
Abstract
Most speech emotion recognition studies consider clean speech. In this study, statistics of joint spectro-temporal modulation features are extracted from an auditory perceptual model and used to detect the emotional state of speech under noisy conditions. Speech samples were taken from the Berlin Emotional Speech database and corrupted with white and babble noise at various SNR levels. This study investigates a clean-train/noisy-test scenario to simulate practical conditions with unknown noise sources. Simulations demonstrate the redundancy of the proposed spectro-temporal modulation features and further consider dimensionality reduction. The proposed modulation features achieve higher recognition rates of speech emotions under noisy conditions than (1) conventional mel-frequency cepstral coefficients combined with prosodic features and (2) the official acoustic features adopted in the INTERSPEECH 2009 Emotion Challenge. Adding modulation features increased the recognition rates of the INTERSPEECH features by approximately 7% for all tested SNR conditions (20–0 dB).
Pages: 47-60
Number of pages: 13