Multi-modal Emotion Recognition Based on Speech and Image

Cited by: 2
Authors
Li, Yongqiang [1 ,2 ]
He, Qi [1 ]
Zhao, Yongping [1 ]
Yao, Hongxun [2 ]
Affiliations
[1] Harbin Inst Technol, Sch Elect Engn & Automat, Harbin 150001, Heilongjiang, Peoples R China
[2] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Heilongjiang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multi-modal emotion recognition; Feature-level fusion; Decision-level fusion; Spectral features
DOI
10.1007/978-3-319-77380-3_81
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
For the past two decades, emotion recognition has gained great attention because of its huge potential in many applications. Most work in this field tries to recognize emotion from a single modality, such as image or speech. Recently, some studies have investigated emotion recognition from multiple modalities, i.e., speech and image. The information fusion strategy is a key point for multi-modal emotion recognition and can be grouped into two main categories: feature-level fusion and decision-level fusion. This paper explores emotion recognition from multiple modalities, i.e., speech and image. We make a systematic and detailed comparison among several feature-level and decision-level fusion methods, such as PCA-based feature fusion, LDA-based feature fusion, product-rule-based decision fusion, and mean-rule-based decision fusion. We test all compared methods on the Surrey Audio-Visual Expressed Emotion (SAVEE) database. The experimental results demonstrate that emotion recognition based on the fusion of speech and image achieves higher recognition accuracy than emotion recognition from a single modality, and that the decision-level fusion methods outperform the feature-level fusion methods in this work.
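The two fusion strategies named in the abstract are simple to prototype. The sketch below is not the authors' implementation: random arrays stand in for the SAVEE speech and image descriptors, and the SVM classifier is an assumption; it only illustrates PCA/LDA feature-level fusion and product-/mean-rule decision-level fusion.

```python
# Minimal sketch of feature-level vs. decision-level fusion.
# NOT the paper's implementation: placeholder random features,
# assumed SVM classifiers.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_speech = rng.normal(size=(120, 40))   # placeholder acoustic features
X_image  = rng.normal(size=(120, 60))   # placeholder facial features
y        = rng.integers(0, 7, 120)      # SAVEE covers 7 emotion classes

# Feature-level fusion: concatenate modality features, then project
# to a joint low-dimensional representation before one classifier.
X_concat = np.hstack([X_speech, X_image])
X_pca = PCA(n_components=30).fit_transform(X_concat)                            # PCA-based
X_lda = LinearDiscriminantAnalysis(n_components=6).fit_transform(X_concat, y)   # LDA-based
clf_fused = SVC(probability=True).fit(X_pca, y)

# Decision-level fusion: one classifier per modality, then combine
# posterior class probabilities (shown on training data for brevity;
# use a held-out split in practice).
P_s = SVC(probability=True).fit(X_speech, y).predict_proba(X_speech)
P_i = SVC(probability=True).fit(X_image, y).predict_proba(X_image)
pred_product = np.argmax(P_s * P_i, axis=1)          # product rule
pred_mean    = np.argmax((P_s + P_i) / 2.0, axis=1)  # mean rule
```

Feature-level fusion commits to a single joint representation, while decision-level fusion lets each modality keep its own classifier and only merges posteriors, which is why the combining rule (product vs. mean) can change the result.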
Pages: 844-853
Page count: 10