Multimodal Machine Learning: A Survey and Taxonomy

Cited by: 1795
Authors
Baltrusaitis, Tadas [1]
Ahuja, Chaitanya [2]
Morency, Louis-Philippe [2]
Affiliations
[1] Microsoft Corp, Cambridge CB1 2FB, England
[2] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
Funding
National Science Foundation (USA)
Keywords
Multimodal; machine learning; introductory; survey; EMOTION RECOGNITION; NEURAL-NETWORKS; SPEECH; TEXT; FUSION; VIDEO; LANGUAGE; MODELS; GENERATION; ALIGNMENT
DOI
10.1109/TPAMI.2018.2798607
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Our experience of the world is multimodal - we see objects, hear sounds, feel texture, smell odors, and taste flavors. Modality refers to the way in which something happens or is experienced and a research problem is characterized as multimodal when it includes multiple such modalities. In order for Artificial Intelligence to make progress in understanding the world around us, it needs to be able to interpret such multimodal signals together. Multimodal machine learning aims to build models that can process and relate information from multiple modalities. It is a vibrant multi-disciplinary field of increasing importance and with extraordinary potential. Instead of focusing on specific multimodal applications, this paper surveys the recent advances in multimodal machine learning itself and presents them in a common taxonomy. We go beyond the typical early and late fusion categorization and identify broader challenges that are faced by multimodal machine learning, namely: representation, translation, alignment, fusion, and co-learning. This new taxonomy will enable researchers to better understand the state of the field and identify directions for future research.
Pages: 423-443
Page count: 21