Kernel Fusion of Audio and Visual Information for Emotion Recognition

Cited by: 0
Authors
Wang, Yongjin [1 ]
Zhang, Rui [1 ]
Guan, Ling [1 ]
Venetsanopoulos, A. N. [1 ]
Affiliations
[1] Ryerson Univ, Dept Elect & Comp Engn, Toronto, ON, Canada
Keywords
Audiovisual emotion recognition; kernel methods; multimodal information fusion; DISCRIMINANT;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Effective analysis and recognition of human emotional behavior are important for achieving efficient and intelligent human-computer interaction. This paper presents an approach for audiovisual-based multimodal emotion recognition. The proposed solution integrates the audio and visual information by fusing the kernel matrices of the respective channels through algebraic operations, followed by dimensionality reduction techniques that map the original disparate features to a nonlinearly transformed joint subspace. A hidden Markov model is employed to characterize the statistical dependence across successive frames and to identify the inherent temporal structure of the features. We examine the kernel fusion method at both the feature and score levels. The effectiveness of the proposed method is demonstrated through extensive experimentation.
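The kernel-matrix fusion step described in the abstract can be illustrated with a minimal sketch. This record does not state which kernels, algebraic operations, or dimensionality reduction technique the paper uses, so everything below is an assumption made for illustration: an RBF kernel per channel, fusion by weighted sum and by elementwise product, and a single kernel PCA component found by power iteration. The toy feature values and all function names are hypothetical, and the hidden Markov model stage is not covered.

```python
import math

def rbf_kernel(X, gamma):
    """Gram matrix with K[i][j] = exp(-gamma * ||x_i - x_j||^2)."""
    n = len(X)
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(X[i], X[j])))
             for j in range(n)] for i in range(n)]

def fuse_sum(Ka, Kv, w=0.5):
    """Weighted sum of two kernel matrices (assumed fusion rule)."""
    return [[w * a + (1.0 - w) * v for a, v in zip(ra, rv)]
            for ra, rv in zip(Ka, Kv)]

def fuse_product(Ka, Kv):
    """Elementwise (Hadamard) product of two kernel matrices (assumed fusion rule)."""
    return [[a * v for a, v in zip(ra, rv)] for ra, rv in zip(Ka, Kv)]

def center(K):
    """Center a symmetric kernel matrix in feature space."""
    n = len(K)
    row_mean = [sum(r) / n for r in K]
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]

def top_component(Kc, iters=500):
    """Leading eigenpair of a centered kernel matrix via power iteration."""
    n = len(Kc)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(iters):
        w = [sum(Kc[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(Kc[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v

# Hypothetical toy data: four samples, 2-D audio and 3-D visual features.
audio = [[0.1, 0.2], [0.0, 0.3], [1.1, 0.9], [1.0, 1.2]]
visual = [[0.5, 0.1, 0.0], [0.4, 0.2, 0.1], [0.0, 1.0, 0.9], [0.1, 0.9, 1.1]]

K_audio = rbf_kernel(audio, gamma=1.0)
K_visual = rbf_kernel(visual, gamma=1.0)

# Both Gram matrices are n x n, so they can be combined despite the
# different feature dimensionalities of the two channels.
K_sum = fuse_sum(K_audio, K_visual, w=0.5)
K_prod = fuse_product(K_audio, K_visual)

# Project the training samples onto the first kernel PCA component
# of the sum-fused kernel (a one-dimensional joint subspace).
lam, v = top_component(center(K_sum))
scores = [math.sqrt(lam) * x for x in v]
print(scores)
```

Fusing at the kernel level means the two channels never need a common feature dimension: each contributes an n x n similarity matrix over the same samples. Sums and elementwise products of positive semidefinite kernel matrices remain positive semidefinite, so the fused matrix can still be passed to a kernel-based dimensionality reduction.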
Pages: 140-150
Number of pages: 11
Related papers
50 in total
  • [21] Feature and Decision Level Audio-visual Data Fusion in Emotion Recognition Problem
    Sidorov, Maxim
    Sopov, Evgenii
    Ivanov, Ilia
    Minker, Wolfgang
    ICINCO 2015 PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON INFORMATICS IN CONTROL, AUTOMATION AND ROBOTICS, VOL. 2, 2015, : 246 - 251
  • [22] Emotion Recognition Using Fusion of Audio and Video Features
    Ortega, Juan D. S.
    Cardinal, Patrick
    Koerich, Alessandro L.
    2019 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC), 2019, : 3847 - 3852
  • [23] AN AUDIO VISUAL EMOTION RECOGNITION SYSTEM USING DEEP LEARNING FUSION FOR A COGNITIVE WIRELESS FRAMEWORK
    Hossain, M. Shamim
    Muhammad, Ghulam
    IEEE WIRELESS COMMUNICATIONS, 2019, 26 (03) : 62 - 68
  • [24] A Joint Cross-Attention Model for Audio-Visual Fusion in Dimensional Emotion Recognition
    Praveen, R. Gnana
    de Melo, Wheidima Carneiro
    Ullah, Nasib
    Aslam, Haseeb
    Zeeshan, Osama
    Denorme, Theo
    Pedersoli, Marco
    Koerich, Alessandro L.
    Bacon, Simon
    Cardinal, Patrick
    Granger, Eric
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 2485 - 2494
  • [25] Audio-Visual Learning for Multimodal Emotion Recognition
    Fan, Siyu
    Jing, Jianan
    Wang, Chongwen
    SYMMETRY-BASEL, 2025, 17 (03):
  • [26] Audio-Visual Attention Networks for Emotion Recognition
    Lee, Jiyoung
    Kim, Sunok
    Kim, Seungryong
    Sohn, Kwanghoon
    AVSU'18: PROCEEDINGS OF THE 2018 WORKSHOP ON AUDIO-VISUAL SCENE UNDERSTANDING FOR IMMERSIVE MULTIMEDIA, 2018, : 27 - 32
  • [27] Emotion recognition based on joint visual and audio cues
    Sebe, Nicu
    Cohen, Ira
    Gevers, Theo
    Huang, Thomas S.
    18TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 1, PROCEEDINGS, 2006, : 1136 - +
  • [28] Deep operational audio-visual emotion recognition
    Akturk, Kaan
    Keceli, Ali Seydi
    NEUROCOMPUTING, 2024, 588
  • [29] Audio-Visual Emotion Recognition in Video Clips
    Noroozi, Fatemeh
    Marjanovic, Marina
    Njegus, Angelina
    Escalera, Sergio
    Anbarjafari, Gholamreza
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2019, 10 (01) : 60 - 75
  • [30] Information Fusion in Attention Networks Using Adaptive and Multi-Level Factorized Bilinear Pooling for Audio-Visual Emotion Recognition
    Zhou, Hengshun
    Du, Jun
    Zhang, Yuanyuan
    Wang, Qing
    Liu, Qing-Feng
    Lee, Chin-Hui
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 2617 - 2629