Multi-Modal Emotion Recognition Fusing Video and Audio

Cited by: 4
Authors
Xu, Chao [1 ]
Du, Pufeng [2 ]
Feng, Zhiyong [2 ]
Meng, Zhaopeng [1 ]
Cao, Tianyi [2 ]
Dong, Caichao [2 ]
Affiliations
[1] Tianjin Univ, Sch Comp Software, Tianjin 300072, Peoples R China
[2] Tianjin Univ, Sch Comp Sci & Technol, Tianjin 300072, Peoples R China
Source
APPLIED MATHEMATICS & INFORMATION SCIENCES | 2013, Vol. 7, No. 2
Funding
National Science Foundation (USA);
Keywords
Emotion Recognition; Multi-modal Fusion; HMM; Multi-layer Perceptron;
DOI
10.12785/amis/070205
Chinese Library Classification (CLC)
O29 [Applied Mathematics];
Subject Classification Code
070104;
Abstract
Emotion plays an important role in human communication. We construct a framework for multi-modal emotion recognition that fuses video and audio. Facial expression features and speech features are extracted from image sequences and speech signals, respectively. To locate and track facial feature points, we build an Active Appearance Model covering facial images with a wide range of expressions. Facial Animation Parameters, computed from the motions of the facial feature points, serve as expression features. From each speech frame we extract short-term mean energy, fundamental frequency, and formant frequencies as speech features. An emotion classifier based on Hidden Markov Models and a Multi-layer Perceptron is designed to fuse the facial expression and speech cues. Experiments indicate that the multi-modal fusion emotion recognition algorithm presented in this paper achieves relatively high recognition accuracy, with better performance and robustness than methods using only video or audio.
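
As a rough illustration of the speech front end named in the abstract (frame-level short-term mean energy, fundamental frequency, and formant frequencies), the following Python sketch computes comparable features with librosa. It is not the authors' implementation: the sample rate, frame and hop sizes, LPC order, pitch range, and the use of the YIN pitch estimator are all assumptions, and formants are approximated from LPC pole angles.

# Minimal sketch (assumed parameters, not the authors' code) of per-frame speech
# features: short-term mean energy, fundamental frequency (F0), and formants F1-F3.
import numpy as np
import librosa

SR = 16000          # assumed sampling rate
FRAME = 400         # 25 ms analysis frames
HOP = 160           # 10 ms hop
LPC_ORDER = 12      # typical LPC order for formant estimation at 16 kHz

def speech_features(wav_path: str) -> np.ndarray:
    y, _ = librosa.load(wav_path, sr=SR)

    # Frame the signal: one column per analysis frame.
    frames = librosa.util.frame(y, frame_length=FRAME, hop_length=HOP)

    # 1) Short-term mean energy per frame.
    energy = np.mean(frames ** 2, axis=0)

    # 2) Fundamental frequency per frame (YIN estimator as a stand-in).
    f0 = librosa.yin(y, fmin=60, fmax=400, sr=SR,
                     frame_length=FRAME, hop_length=HOP)

    # 3) Formant frequencies per frame from the angles of LPC poles.
    formants = []
    for frame in frames.T:
        if np.max(np.abs(frame)) < 1e-6:      # skip silent frames
            formants.append([0.0, 0.0, 0.0])
            continue
        a = librosa.lpc(frame * np.hamming(FRAME), order=LPC_ORDER)
        roots = [r for r in np.roots(a) if np.imag(r) > 0]
        freqs = sorted(f for f in np.angle(np.asarray(roots)) * SR / (2 * np.pi)
                       if f > 90)
        formants.append((freqs + [0.0, 0.0, 0.0])[:3])   # F1-F3, zero-padded
    formants = np.asarray(formants)

    # Align lengths (YIN and manual framing can differ by a frame or two).
    n = min(len(energy), len(f0), len(formants))
    return np.column_stack([energy[:n], f0[:n], formants[:n]])

In the paper's pipeline, frame-level vectors of this kind (alongside the Facial Animation Parameters from the video stream) would be the inputs to the HMM and Multi-layer Perceptron fusion classifier described in the abstract.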
Pages: 455-462 (8 pages)
相关论文
共 50 条
  • [31] Multi-modal Emotion Recognition for Determining Employee Satisfaction
    Zaman, Farhan Uz
    Zaman, Maisha Tasnia
    Alam, Md Ashraful
    Alam, Md Golam Rabiul
    2021 IEEE ASIA-PACIFIC CONFERENCE ON COMPUTER SCIENCE AND DATA ENGINEERING (CSDE), 2021,
  • [32] Semantic Alignment Network for Multi-Modal Emotion Recognition
    Hou, Mixiao
    Zhang, Zheng
    Liu, Chang
    Lu, Guangming
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (09) : 5318 - 5329
  • [33] Emotion recognition with multi-modal peripheral physiological signals
    Gohumpu, Jennifer
    Xue, Mengru
    Bao, Yanchi
    FRONTIERS IN COMPUTER SCIENCE, 2023, 5
  • [34] Facial emotion recognition using multi-modal information
    De Silva, LC
    Miyasato, T
    Nakatsu, R
    ICICS - PROCEEDINGS OF 1997 INTERNATIONAL CONFERENCE ON INFORMATION, COMMUNICATIONS AND SIGNAL PROCESSING, VOLS 1-3: THEME: TRENDS IN INFORMATION SYSTEMS ENGINEERING AND WIRELESS MULTIMEDIA COMMUNICATIONS, 1997, : 397 - 401
  • [35] Driver multi-task emotion recognition network based on multi-modal facial video analysis
    Xiang, Guoliang
    Yao, Song
    Wu, Xianhui
    Deng, Hanwen
    Wang, Guojie
    Liu, Yu
    Li, Fan
    Peng, Yong
    PATTERN RECOGNITION, 2025, 161
  • [36] Lightweight multi-modal emotion recognition model based on modal generation
    Liu, Peisong
    Che, Manqiang
    Luo, Jiangchuan
    2022 9TH INTERNATIONAL FORUM ON ELECTRICAL ENGINEERING AND AUTOMATION, IFEEA, 2022, : 430 - 435
  • [37] Cross-modal dynamic convolution for multi-modal emotion recognition
    Wen, Huanglu
    You, Shaodi
    Fu, Ying
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2021, 78
  • [38] Multi-modal emotion recognition in conversation based on prompt learning with text-audio fusion features
    Wu, Yuezhou
    Zhang, Siling
    Li, Pengfei
    SCIENTIFIC REPORTS, 2025, 15 (01):
  • [39] Audio-Visual Emotion Recognition With Preference Learning Based on Intended and Multi-Modal Perceived Labels
    Lei, Yuanyuan
    Cao, Houwei
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2023, 14 (04) : 2954 - 2969
  • [40] Research on Multi-modal Music Emotion Classification Based on Audio and Lyirc
    Liu, Gaojun
    Tan, Zhiyuan
    PROCEEDINGS OF 2020 IEEE 4TH INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC 2020), 2020, : 2331 - 2335