Modout: Learning Multi-modal Architectures by Stochastic Regularization

Cited by: 9
Authors
Li, Fan [1]
Neverova, Natalia [2]
Wolf, Christian [3]
Taylor, Graham [1]
Affiliations
[1] Univ Guelph, Sch Engn, Guelph, ON, Canada
[2] Facebook, Paris, France
[3] INSA Lyon, LIRIS, Lyon, France
DOI
10.1109/FG.2017.59
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Model selection methods based on stochastic regularization have been widely used in deep learning due to their simplicity and effectiveness. The well-known Dropout method treats all units, visible or hidden, in the same way, thus ignoring any a priori information related to grouping or structure. Such structure is present in multi-modal learning applications such as affect analysis and gesture recognition, where subsets of units may correspond to individual modalities. Here we describe Modout, a model selection method based on stochastic regularization, which is particularly useful in the multi-modal setting. Different from other forms of stochastic regularization, it is capable of learning whether or when to fuse two modalities in a layer, which is usually considered to be an architectural hyper-parameter by deep learning researchers and practitioners. Modout is evaluated on two real multi-modal datasets. The results indicate improved performance compared to other forms of stochastic regularization. The result on the Montalbano dataset shows that learning a fusion structure by Modout is on par with a state-of-the-art carefully designed architecture.
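The contrast the abstract draws between unit-level Dropout and modality-aware regularization can be illustrated with a small sketch. This is not the authors' implementation: it assumes, as one plausible reading of the abstract, that a layer's weight matrix is split into blocks by modality and that each cross-modality block is kept or dropped as a whole. The function name `modality_block_mask` and the parameter `p_cross` are hypothetical, and the keep probabilities are fixed here, whereas the abstract says Modout learns whether or when to fuse.

```python
import numpy as np


def modality_block_mask(sizes_in, sizes_out, p_cross, rng):
    """Sample a 0/1 mask over a weight matrix split into modality blocks.

    Assumption for illustration: same-modality (diagonal) blocks are always
    kept, and each cross-modality block (i, j) is kept as a whole with
    probability p_cross[i][j]. Learning these probabilities is omitted.
    """
    mask = np.zeros((sum(sizes_out), sum(sizes_in)))
    r0 = 0
    for i, n_out in enumerate(sizes_out):
        c0 = 0
        for j, n_in in enumerate(sizes_in):
            # One Bernoulli draw per block, not per unit as in Dropout.
            keep = 1.0 if i == j else float(rng.random() < p_cross[i][j])
            mask[r0:r0 + n_out, c0:c0 + n_in] = keep
            c0 += n_in
        r0 += n_out
    return mask


rng = np.random.default_rng(0)
sizes = [3, 2]                      # two modalities with 3 and 2 units
W = rng.standard_normal((5, 5))     # dense layer weights
x = rng.standard_normal(5)          # concatenated modality features
mask = modality_block_mask(sizes, sizes, [[1.0, 0.5], [0.5, 1.0]], rng)
h = (W * mask) @ x                  # cross-modal blocks are fused or cut together
```

With every cross-modality probability at 0 the layer reduces to separate per-modality streams, and at 1 it is an ordinary fully connected (fused) layer; values in between sample between the two during training.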
Pages: 422-429
Page count: 8
Related papers
50 in total
  • [1] Multi-modal Subspace Learning with Dropout regularization for Cross-modal Recognition and Retrieval
    Cao, Guanqun
    Waris, Muhammad Adeel
    Iosifidis, Alexandros
    Gabbouj, Moncef
    [J]. 2016 SIXTH INTERNATIONAL CONFERENCE ON IMAGE PROCESSING THEORY, TOOLS AND APPLICATIONS (IPTA), 2016
  • [2] Multi-modal Subspace Learning with Joint Graph Regularization for Cross-modal Retrieval
    Wang, Kaiye
    Wang, Wei
    He, Ran
    Wang, Liang
    Tan, Tieniu
    [J]. 2013 SECOND IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR 2013), 2013, : 236 - 240
  • [3] Multi-modal Learning Algorithms and Network Architectures for Information Extraction and Retrieval
    Bleeker, Maurits
    [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 6925 - 6929
  • [4] Multi-modal anchor adaptation learning for multi-modal summarization
    Chen, Zhongfeng
    Lu, Zhenyu
    Rong, Huan
    Zhao, Chuanjun
    Xu, Fan
    [J]. NEUROCOMPUTING, 2024, 570
  • [5] Deep Learning Based Multi-Modal Fusion Architectures for Maritime Vessel Detection
    Farahnakian, Fahimeh
    Heikkonen, Jukka
    [J]. REMOTE SENSING, 2020, 12 (16)
  • [6] Multi-modal advanced deep learning architectures for breast cancer survival prediction
    Arya, Nikhilanand
    Saha, Sriparna
    [J]. KNOWLEDGE-BASED SYSTEMS, 2021, 221
  • [7] Abnormal Behavior Detection using a Multi-Modal Stochastic Learning Approach
    Bouttefroy, P. L. M.
    Bouzerdoum, A.
    Phung, S. L.
    Beghdadi, A.
    [J]. ISSNIP 2008: PROCEEDINGS OF THE 2008 INTERNATIONAL CONFERENCE ON INTELLIGENT SENSORS, SENSOR NETWORKS, AND INFORMATION PROCESSING, 2008, : 121 - +
  • [8] Unsupervised Multi-modal Learning
    Iqbal, Mohammed Shameer
    [J]. ADVANCES IN ARTIFICIAL INTELLIGENCE (AI 2015), 2015, 9091 : 343 - 346
  • [9] Learning Multi-modal Similarity
    McFee, Brian
    Lanckriet, Gert
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2011, 12 : 491 - 523
  • [10] Deep learning architectures for Parkinson's disease detection by using multi-modal features
    Pahuja, Gunjan
    Prasad, Bhanu
    [J]. COMPUTERS IN BIOLOGY AND MEDICINE, 2022, 146