Kernel Fusion of Audio and Visual Information for Emotion Recognition

Cited by: 0

Authors:

Wang, Yongjin [1]

Zhang, Rui [1]

Guan, Ling [1]

Venetsanopoulos, A. N. [1]

Affiliations:

[1] Ryerson Univ, Dept Elect & Comp Engn, Toronto, ON, Canada

Source:

IMAGE ANALYSIS AND RECOGNITION: 8TH INTERNATIONAL CONFERENCE, ICIAR 2011, PT II | 2011 / Vol. 6754

Keywords:

Audiovisual emotion recognition; kernel methods; multimodal information fusion; DISCRIMINANT

DOI:

Not available

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Effective analysis and recognition of human emotional behavior are important for achieving efficient and intelligent human-computer interaction. This paper presents an approach for audiovisual-based multimodal emotion recognition. The proposed solution integrates the audio and visual information by fusing the kernel matrices of the respective channels through algebraic operations, followed by dimensionality reduction techniques that map the original disparate features to a nonlinearly transformed joint subspace. A hidden Markov model is employed to characterize the statistical dependence across successive frames and to identify the inherent temporal structure of the features. We examine the kernel fusion method at both the feature and score levels. The effectiveness of the proposed method is demonstrated through extensive experimentation.
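
The fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the RBF kernel, the gamma values, the weighted-sum and element-wise-product fusion rules, the use of kernel PCA as the dimensionality reduction step, the random stand-in features, and all function names are assumptions chosen for illustration. The hidden Markov model stage is omitted.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gram matrix of an RBF kernel over the rows of X (assumed kernel choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def fuse_kernels(K_audio, K_visual, mode="sum", weight=0.5):
    """Combine two per-channel kernel matrices with a simple algebraic operation.

    Both the weighted sum and the element-wise product of valid kernels
    are again valid (positive semidefinite) kernels.
    """
    if mode == "sum":
        return weight * K_audio + (1.0 - weight) * K_visual
    if mode == "product":
        return K_audio * K_visual
    raise ValueError(f"unknown fusion mode: {mode}")


def kernel_pca(K, n_components):
    """Project training samples onto the leading kernel principal components."""
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    # Center the kernel matrix in feature space.
    Kc = K - one @ K - K @ one + one @ K @ one
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Training-sample projections: unit eigenvectors scaled by sqrt(eigenvalue).
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))


# Hypothetical data: 20 frames with 12-dim audio and 30-dim visual features.
rng = np.random.default_rng(0)
audio = rng.normal(size=(20, 12))
visual = rng.normal(size=(20, 30))

K = fuse_kernels(rbf_kernel(audio, 0.05), rbf_kernel(visual, 0.02), mode="sum")
Z = kernel_pca(K, 3)  # joint-subspace features, one row per frame
```

In a pipeline of the kind the abstract outlines, the rows of `Z` would be the per-frame features passed to a temporal model. Because fusion happens on the kernel matrices, the two channels can have different feature dimensions without any concatenation or rescaling of the raw features.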


Pages: 140-150

Page count: 11
