Research on Emotion Classification Based on Multi-modal Fusion

Cited by: 1
Authors
Xiang, Zhihua [1 ,2 ]
Radzi, Nor Haizan Mohamed [1 ]
Hashim, Haslina [1 ]
Affiliations
[1] Univ Teknol Malaysia, Fac Comp, Johor Baharu 81310, Johor, Malaysia
[2] Guangdong Technol Coll, 526100 Qifu Ave, Zhaoqing, Guangdong, Peoples R China
Keywords
Dynamic correlation; Feature matching; Multi-modal emotion classification; Match fusion; Temporal attention;
DOI
10.21123/bsj.2024.9454
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Subject classification codes
07; 0710; 09;
Abstract
Nowadays, people's expression on the Internet is no longer limited to text; with the rise of short video in particular, large volumes of multi-modal data such as text, pictures, audio, and video have emerged. Compared with single-modal data, multi-modal data carries far richer information, and mining it can help computers better understand human emotional characteristics. However, because multi-modal data exhibits pronounced dynamic time-series features, the fusion process must resolve the dynamic correlations both within a single modality and between different modalities in the same application scene. To address this problem, this paper establishes a three-dimensional dynamic-expansion feature extraction framework for common multi-modal data such as video, audio, and text. On this basis, a multi-modal fusion-matching framework with spatial and temporal feature enhancement is proposed to resolve the dynamic correlations within and between modalities, and the short- and long-term dynamic correlation information between modalities is then modeled on top of it. Multiple groups of experiments on the MOSI dataset show that the emotion recognition model built on the proposed framework better exploits the complex complementary information between different modal data. Compared with other frameworks, the one proposed in this paper significantly improves the emotion recognition rate and accuracy when applied to multi-modal emotion analysis, demonstrating its feasibility and effectiveness.
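The abstract describes fusing per-modality feature sequences with temporal attention before classification. The paper's actual architecture is not given here, so the following is only an illustrative sketch of that general idea: each modality's (time, feature) sequence is pooled with a learned temporal-attention weighting, and the pooled vectors are concatenated into one fused representation. All function names, parameter shapes, and dimensions are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_attention_pool(seq, w):
    """Collapse a (T, d) feature sequence to (d,) via attention over time.

    `w` (shape (d,)) plays the role of a learned scoring vector; here it is
    just a placeholder for whatever the model would learn.
    """
    scores = softmax(seq @ w)   # (T,) attention weights over time steps
    return scores @ seq         # weighted sum over the sequence

def fuse_modalities(text, audio, video, params):
    """Concatenate temporally pooled per-modality vectors into one feature."""
    pooled = [temporal_attention_pool(m, params[k])
              for k, m in (("text", text), ("audio", audio), ("video", video))]
    return np.concatenate(pooled)

rng = np.random.default_rng(0)
T, d = 8, 4  # illustrative sequence length and feature size
params = {k: rng.normal(size=d) for k in ("text", "audio", "video")}
fused = fuse_modalities(rng.normal(size=(T, d)),
                        rng.normal(size=(T, d)),
                        rng.normal(size=(T, d)), params)
print(fused.shape)  # (12,) -- three pooled d-dim vectors concatenated
```

A real system would replace the random scoring vectors with learned parameters and feed the fused vector to an emotion classifier; cross-modal (not just per-modality) attention would be needed to model the inter-modality correlations the abstract emphasizes.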
Pages: 548 - 560
Page count: 13