Emotion Classification from Speech and Text in Videos Using a Multimodal Approach

Cited by: 8
Authors
Caschera, Maria Chiara [1]
Grifoni, Patrizia [1]
Ferri, Fernando [1]
Affiliations
[1] CNR, Inst Res Populat & Social Policies CNR IRPPS, Via Palestro 32, I-00185 Rome, Italy
Keywords
emotion classification; multimodal interaction; hidden Markov models; sentiment analysis; recognition; features; models; algorithms; fusion; audio; SVM
DOI
10.3390/mti6040028
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Emotion classification is a research area with an extensive literature spanning natural language processing, multimedia data, semantic knowledge discovery, social network mining, and text and multimedia data mining. This paper proposes a method for classifying the emotions expressed in multimodal data extracted from videos. The proposed method models multimodal data as a sequence of features extracted from facial expressions, speech, gestures, and text, using a linguistic approach. Each sequence of multimodal data is then associated with an emotion, with each emotion modeled by a hidden Markov model (HMM). The trained model is evaluated on samples of multimodal sentences associated with seven basic emotions. The experimental results demonstrate a good classification rate for emotions.
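
The abstract's core idea, one hidden Markov model per emotion and a feature sequence assigned to an emotion through those models, can be illustrated with a minimal sketch. It is not the authors' implementation: the decision rule (pick the model with the highest forward-algorithm likelihood), the two toy emotions, the three integer-coded observation symbols, and every probability below are assumptions chosen only for illustration. The paper itself works with seven basic emotions and features drawn from facial expressions, speech, gestures, and text.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) using the scaled forward algorithm.

    obs   : list of integer symbol indices (hypothetical feature codes)
    start : start[i]    = P(first state is i)
    trans : trans[i][j] = P(next state is j | current state is i)
    emit  : emit[i][k]  = P(symbol k | state i)
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transitions, then weight by emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        # Rescale to avoid underflow; the log of the scales sums to the log-likelihood.
        alpha = [a / scale for a in alpha]
        log_lik += math.log(scale)
    return log_lik


def classify(obs, models):
    """Return the label whose HMM gives obs the highest log-likelihood."""
    return max(models, key=lambda label: forward_log_likelihood(obs, *models[label]))


# Toy models (all values are made up). Symbols: 0, 1, 2 stand for three
# hypothetical quantized multimodal features.
MODELS = {
    "happiness": (
        [0.6, 0.4],
        [[0.7, 0.3], [0.4, 0.6]],
        [[0.7, 0.2, 0.1], [0.5, 0.4, 0.1]],
    ),
    "anger": (
        [0.5, 0.5],
        [[0.6, 0.4], [0.3, 0.7]],
        [[0.1, 0.2, 0.7], [0.1, 0.4, 0.5]],
    ),
}

if __name__ == "__main__":
    print(classify([0, 0, 1, 0], MODELS))
    print(classify([2, 2, 1, 2], MODELS))
```

In practice the model parameters would be estimated from labeled training sequences (for example with Baum-Welch) rather than written by hand, and a sequence dominated by symbol 0 is scored higher by the toy "happiness" model simply because that model was given high emission probabilities for symbol 0.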
Pages: 23