Survey on Deep Multi-modal Data Analytics: Collaboration, Rivalry, and Fusion

Cited by: 133
Authors
Wang, Yang [1, 2]
Affiliations
[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei, Peoples R China
[2] Hefei Univ Technol, Intelligent Interconnected Syst Lab Anhui Prov, Hefei, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multi-modal data; deep neural networks; MULTIVIEW; REPRESENTATIONS; RECOGNITION; NETWORK;
DOI
10.1145/3408317
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
With the development of web technology, multi-modal or multi-view data has surged as a major stream of big data, where each modality/view encodes an individual property of the data objects. Often, different modalities are complementary to each other. This fact has motivated considerable research on fusing multi-modal feature spaces to comprehensively characterize the data objects. Most existing state-of-the-art methods focus on how to fuse the energy or information from multi-modal spaces to deliver performance superior to that of their single-modal counterparts. Recently, deep neural networks have proven to be a powerful architecture for capturing the nonlinear distribution of high-dimensional multimedia data, and naturally of multi-modal data as well. Substantial empirical studies have demonstrated the advantages of deep multi-modal methods, which can essentially deepen the fusion of multi-modal deep feature spaces. In this article, we provide a substantial overview of the existing state-of-the-art methods in the field of multi-modal data analytics, from shallow to deep spaces. Throughout this survey, we further indicate that the critical components of this field are collaboration, adversarial competition, and fusion over multi-modal spaces. Finally, we share our viewpoints regarding some future directions in this field.
Pages: 25