SPATIO-TEMPORAL CONTEXT MODELLING FOR SPEECH EMOTION CLASSIFICATION

Cited by: 0
Authors
Jalal, Md Asif [1]
Moore, Roger K. [1]
Hain, Thomas [1]
Affiliations
[1] Univ Sheffield, Speech & Hearing Res Grp SPandH, Sheffield, S Yorkshire, England
Keywords
Emotion classification; SER; Deep Neural Networks; Convolutional Neural Network; Attention Network;
DOI
10.1109/asru46091.2019.9004037
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Speech emotion recognition (SER) is a requisite for emotional intelligence that affects the understanding of speech. One of the most crucial tasks is to obtain patterns from the speech signal that correlate maximally with the emotion classification task while being invariant to changes in frequency, time and other external distortions. Therefore, learning emotional contextual feature representations independent of speaker and environment is essential. In this paper, a novel spatiotemporal context modelling framework for robust SER is proposed to learn feature representations by using acoustic context expansion with high-dimensional feature projection. The framework uses a deep convolutional neural network (CNN) and a self-attention network. The CNNs combine spatiotemporal features. The attention network produces high-dimensional task-specific features and combines these features for context modelling, which altogether provides a state-of-the-art technique for classifying the extracted patterns for speech emotion. Speech emotion is a categorical perception representing discrete sensory events. The proposed approach is compared with a wide range of architectures on the RAVDESS and IEMOCAP corpora for 8-class and 4-class emotion classification tasks, and remarkable absolute gains of 15% and 10%, respectively, over state-of-the-art systems are obtained.
Pages: 853-859
Number of pages: 7
Related papers
50 in total
  • [21] Spatio-Temporal Context Kernel for Activity Recognition
    Yuan, Fei
    Sahbi, Hichem
    Prinet, Veronique
    [J]. 2011 FIRST ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR), 2011, : 436 - 440
  • [22] An Improved Spatio-temporal Context Tracking Algorithm
    Wan, Hao
    Li, Weiguang
    Ye, Guoqiang
    [J]. PROCEEDINGS OF THE 2018 13TH IEEE CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA 2018), 2018, : 1320 - 1325
  • [23] A spatio-temporal architecture for context aware sensing
    Thiemjarus, Surapa
    Lo, Benny
    Yang, Guang-Zhong
    [J]. BSN 2006: INTERNATIONAL WORKSHOP ON WEARABLE AND IMPLANTABLE BODY SENSOR NETWORKS, PROCEEDINGS, 2006, : 191 - +
  • [24] Conversation Group Detection With Spatio-Temporal Context
    Tan, Stephanie
    Tax, David M. J.
    Hung, Hayley
    [J]. PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, ICMI 2022, 2022, : 170 - 180
  • [25] Deep spatio-temporal features for multimodal emotion recognition
    Nguyen, Dung
    Nguyen, Kien
    Sridharan, Sridha
    Ghasemi, Afsane
    Dean, David
    Fookes, Clinton
    [J]. 2017 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2017), 2017, : 1215 - 1223
  • [26] Spatio-Temporal Context Tracking with Color Attributes
    Xu, Bo
    Wang, Zhenhai
    Kang, Yuyun
    Wang, Yulan
    [J]. 2017 13TH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (ICNC-FSKD), 2017, : 717 - 721
  • [27] Spatio-temporal information systems in a statistical context
    Tininini, L
    Paolucci, M
    Sindoni, G
    De Francisci, S
    [J]. ADVANCES IN DATABASE TECHNOLOGY - EDBT 2002, 2002, 2287 : 307 - 316
  • [28] Spatio-temporal context for robust multitarget tracking
    Nguyen, Hieu T.
    Ji, Qiang
    Smeulders, Arnold W. M.
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2007, 29 (01) : 52 - 64
  • [29] On Spatio-Temporal Modelling of Stream Network Initiation
    Papageorgaki I.
    Nalbantis I.
    [J]. Environmental Processes, 2018, 5 (Suppl 1) : 239 - 257
  • [30] Spatio-temporal modelling of the status of groundwater droughts
    Marchant, B. P.
    Bloomfield, J. P.
    [J]. JOURNAL OF HYDROLOGY, 2018, 564 : 397 - 413