Research on Emotion Classification Based on Multi-modal Fusion

Cited by: 1
Authors
Xiang, Zhihua [1 ,2 ]
Radzi, Nor Haizan Mohamed [1 ]
Hashim, Haslina [1 ]
Affiliations
[1] Univ Teknol Malaysia, Fac Comp, Johor Baharu 81310, Johor, Malaysia
[2] Guangdong Technol Coll, 526100 Qifu Ave, Zhaoqing, Guangdong, Peoples R China
Keywords
Dynamic correlation; Feature matching; Multi-modal emotion classification; Match fusion; Temporal attention;
DOI
10.21123/bsj.2024.9454
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Subject classification codes
07; 0710; 09;
Abstract
Nowadays, people's expression on the Internet is no longer limited to text; with the rise of short video in particular, large volumes of multi-modal data such as text, pictures, audio, and video have emerged. Compared with single-modal data, multi-modal data carries far richer information, and mining it can help computers better understand human emotional characteristics. However, because multi-modal data exhibits pronounced dynamic time-series features, the fusion process must resolve the dynamic correlations both within a single modality and between different modalities in the same application scene. To address this problem, this paper establishes a three-dimensional dynamic-expansion feature extraction framework for common multi-modal data such as video, audio, and text. On this basis, a multi-modal fusion-matching framework with spatial and temporal feature enhancement is proposed to resolve the dynamic correlations within and between modalities, and the short- and long-term dynamic correlation information between modalities is then modeled on top of it. Multiple groups of experiments on the MOSI dataset show that the emotion recognition model built on the proposed framework better exploits the complex complementary information between different modal data. Compared with other frameworks, the one proposed in this paper significantly improves the emotion recognition rate and accuracy when applied to multi-modal emotion analysis, demonstrating its feasibility and effectiveness.
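The abstract describes fusing per-modality feature sequences with temporal attention before classification. The paper's actual architecture is not given here, so the following is only an illustrative sketch of that general idea: each modality's (time, feature) sequence is pooled with a learned temporal-attention weighting, and the pooled vectors are concatenated into one fused representation. All function names, parameter shapes, and dimensions are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_attention_pool(seq, w):
    """Collapse a (T, d) feature sequence to (d,) via attention over time.

    `w` (shape (d,)) plays the role of a learned scoring vector; here it is
    just a placeholder for whatever the model would learn.
    """
    scores = softmax(seq @ w)   # (T,) attention weights over time steps
    return scores @ seq         # weighted sum over the sequence

def fuse_modalities(text, audio, video, params):
    """Concatenate temporally pooled per-modality vectors into one feature."""
    pooled = [temporal_attention_pool(m, params[k])
              for k, m in (("text", text), ("audio", audio), ("video", video))]
    return np.concatenate(pooled)

rng = np.random.default_rng(0)
T, d = 8, 4  # illustrative sequence length and feature size
params = {k: rng.normal(size=d) for k in ("text", "audio", "video")}
fused = fuse_modalities(rng.normal(size=(T, d)),
                        rng.normal(size=(T, d)),
                        rng.normal(size=(T, d)), params)
print(fused.shape)  # (12,) -- three pooled d-dim vectors concatenated
```

A real system would replace the random scoring vectors with learned parameters and feed the fused vector to an emotion classifier; cross-modal (not just per-modality) attention would be needed to model the inter-modality correlations the abstract emphasizes.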
Pages: 548 - 560
Page count: 13