Survey on Deep Multi-modal Data Analytics: Collaboration, Rivalry, and Fusion

Cited by: 133
Author
Wang, Yang [1, 2]
Affiliations
[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei, Peoples R China
[2] Hefei Univ Technol, Intelligent Interconnected Syst Lab Anhui Prov, Hefei, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multi-modal data; deep neural networks; MULTIVIEW; REPRESENTATIONS; RECOGNITION; NETWORK;
DOI
10.1145/3408317
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the development of web technology, multi-modal or multi-view data has surged as a major stream of big data, where each modality/view encodes an individual property of the data objects. Often, different modalities are complementary to each other. This fact has motivated considerable research on fusing multi-modal feature spaces to characterize data objects comprehensively. Most existing state-of-the-art methods focus on how to fuse the energy or information from multi-modal spaces to deliver superior performance over their single-modal counterparts. Recently, deep neural networks have proven to be a powerful architecture for capturing the nonlinear distribution of high-dimensional multimedia data, and naturally of multi-modal data as well. Substantial empirical studies have demonstrated the advantages of deep multi-modal methods, which can essentially deepen the fusion from multi-modal deep feature spaces. In this article, we provide a substantial overview of the existing state-of-the-art methods in the field of multi-modal data analytics, from shallow to deep spaces. Throughout this survey, we further indicate that the critical components of this field are collaboration, adversarial competition, and fusion over multi-modal spaces. Finally, we share our viewpoints regarding some future directions in this field.
Pages: 25