Audio-Video Based Multimodal Emotion Recognition Using SVMs and Deep Learning

Cited by: 5
Authors
Sun, Bo [1]
Xu, Qihua [1, 2]
He, Jun [1]
Yu, Lejun [1]
Li, Liandong [1]
Wei, Qinglan [1]
Affiliations
[1] Beijing Normal Univ, Coll Informat Sci & Technol, Beijing, Peoples R China
[2] Northwest Normal Univ, Sch Business, Lanzhou, Gansu, Peoples R China
Keywords
Emotion recognition; Spatio-temporal information; Deep learning; Decision-level fusion; Deep convolutional neural network
DOI
10.1007/978-981-10-3005-5_51
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we explore a multi-feature classification framework for the Multimodal Emotion Recognition Challenge, which is part of the Chinese Conference on Pattern Recognition (CCPR 2016). The task of the challenge is to recognize one of eight facial emotions in short video segments extracted from Chinese films, TV plays and talk shows. In our framework, both traditional methods and Deep Convolutional Neural Network (DCNN) methods are used to extract various features. Different classifiers are trained on the different features to predict video emotion labels, and a decision-level fusion method is explored to aggregate their predictions. According to the results on the competition database, our method performs better on Chinese facial emotion recognition.
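The decision-level fusion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which fusion rule the authors use, so the weighted averaging of per-classifier class probabilities below is an assumption, and the function names, weights, and probability values are hypothetical placeholders rather than anything taken from the paper.

```python
from typing import List, Sequence

NUM_CLASSES = 8  # the challenge defines eight emotion categories


def fuse_decisions(prob_sets: Sequence[Sequence[float]],
                   weights: Sequence[float]) -> List[float]:
    """Weighted average of class-probability vectors, one per classifier.

    Assumed fusion rule (not confirmed by the abstract): each classifier
    contributes its probabilities scaled by a normalized weight.
    """
    if len(prob_sets) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(prob_sets[0])
    fused = [0.0] * n_classes
    for probs, weight in zip(prob_sets, weights):
        if len(probs) != n_classes:
            raise ValueError("all classifiers must score the same classes")
        for i, p in enumerate(probs):
            fused[i] += (weight / total) * p
    return fused


def predict_label(fused: Sequence[float]) -> int:
    """Index of the class with the highest fused score."""
    return max(range(len(fused)), key=lambda i: fused[i])


# Hypothetical outputs for one video clip from two classifiers,
# e.g. one trained on hand-crafted features and one on DCNN features.
svm_probs = [0.5, 0.1, 0.1, 0.1, 0.05, 0.05, 0.05, 0.05]
dcnn_probs = [0.1, 0.6, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05]

fused = fuse_decisions([svm_probs, dcnn_probs], weights=[1.0, 3.0])
label = predict_label(fused)
print(label, [round(p, 3) for p in fused])
```

In a sketch like this, the weights would typically be chosen on a validation set. Other decision-level rules, such as majority voting or a learned meta-classifier, fit the same interface.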
Pages: 621-631
Number of pages: 11