Learning multi-modal dictionaries: Application to audiovisual data

Cited: 0
Authors
Monaci, Gianluca [1 ]
Jost, Philippe
Vandergheynst, Pierre
Mailhe, Boris
Lesage, Sylvain
Gribonval, Remi
Institutions
[1] Ecole Polytech Fed Lausanne, Signal Proc Inst, CH-1015 Lausanne, Switzerland
[2] INRIA, IRISA, F-35042 Rennes, France
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP [automation technology, computer technology];
Discipline code
0812;
Abstract
This paper presents a methodology for extracting meaningful synchronous structures from multi-modal signals. Simultaneous processing of multi-modal data can reveal information that is unavailable when the sources are handled separately. In natural high-dimensional data, however, the statistical dependencies between modalities are often not obvious. Learning fundamental multi-modal patterns is an alternative to classical statistical methods. Recurrent patterns are typically shift invariant, so the learning procedure should find the best matching filters. We present a new algorithm for iteratively learning multi-modal generating functions that can be shifted to any position in the signal. The proposed algorithm is applied to audiovisual sequences and is shown to discover underlying structures in the data.
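To make the abstract's idea concrete, here is a toy sketch of shift-invariant multi-modal pattern learning: a pair of generating functions (one per modality) is aligned to each training signal at a jointly best shift, then re-estimated from the aligned patches. This is a hypothetical simplification for illustration, not the authors' exact algorithm; the function name, signature, and alternating align/average scheme are assumptions.

```python
import numpy as np

def learn_multimodal_kernel(audio_sigs, video_sigs, k_len=16, n_iter=10, seed=0):
    """Toy shift-invariant learning of one audio-video kernel pair.

    Illustrative simplification (not the paper's method): alternately
    (1) find, for each training pair, the shift maximizing the summed
    cross-correlation magnitude in both modalities, and
    (2) re-estimate each kernel as the normalized average of the patches
    extracted at those shifts.
    """
    rng = np.random.default_rng(seed)
    ka = rng.standard_normal(k_len)
    ka /= np.linalg.norm(ka)
    kv = rng.standard_normal(k_len)
    kv /= np.linalg.norm(kv)
    for _ in range(n_iter):
        patches_a, patches_v = [], []
        for a, v in zip(audio_sigs, video_sigs):
            # Joint score over all shifts: the same shift must explain
            # both modalities, which is what ties them together.
            ca = np.correlate(a, ka, mode="valid")
            cv = np.correlate(v, kv, mode="valid")
            t = int(np.argmax(np.abs(ca) + np.abs(cv)))
            patches_a.append(a[t:t + k_len])
            patches_v.append(v[t:t + k_len])
        ka = np.mean(patches_a, axis=0)
        ka /= np.linalg.norm(ka)
        kv = np.mean(patches_v, axis=0)
        kv /= np.linalg.norm(kv)
    return ka, kv
```

The key design point the sketch illustrates is the *joint* shift selection: using a single shift for both modalities is what forces the learned pair of functions to capture synchronous audio-visual structure rather than independent patterns.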
Pages: 538-545
Page count: 8
Related papers
50 records in total
  • [1] Multi-modal learning and its application for biomedical data
    Liu, Jin
    Zhang, Yu-Dong
    Cai, Hongming
    [J]. FRONTIERS IN MEDICINE, 2024, 10
  • [2] Learning to Hash on Partial Multi-Modal Data
    Wang, Qifan
    Si, Luo
    Shen, Bin
    [J]. PROCEEDINGS OF THE TWENTY-FOURTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI), 2015, : 3904 - 3910
  • [3] Multi-modal Integration of Dynamic Audiovisual Patterns for an Interactive Reinforcement Learning Scenario
    Cruz, Francisco
    Parisi, German I.
    Twiefel, Johannes
    Wermter, Stefan
    [J]. 2016 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2016), 2016, : 759 - 766
  • [4] Multi-modal anchor adaptation learning for multi-modal summarization
    Chen, Zhongfeng
    Lu, Zhenyu
    Rong, Huan
    Zhao, Chuanjun
    Xu, Fan
    [J]. NEUROCOMPUTING, 2024, 570
  • [5] Learning Shared and Specific Factors for Multi-modal Data
    Yin, Qiyue
    Huang, Yan
    Wu, Shu
    Wang, Liang
    [J]. COMPUTER VISION, PT II, 2017, 772 : 89 - 98
  • [6] LEARNING UNIFIED SPARSE REPRESENTATIONS FOR MULTI-MODAL DATA
    Wang, Kaiye
    Wang, Wei
    Wang, Liang
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2015, : 3545 - 3549
  • [7] Multi-modal Contrastive Learning for Healthcare Data Analytics
    Li, Rui
    Gao, Jing
    [J]. 2022 IEEE 10TH INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI 2022), 2022, : 120 - 127
  • [8] Learning Concept Taxonomies from Multi-modal Data
    Zhang, Hao
    Hu, Zhiting
    Deng, Yuntian
    Sachan, Mrinmaya
    Yan, Zhicheng
    Xing, Eric P.
    [J]. PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016, : 1791 - 1801
  • [9] A Temporal Dependency Based Multi-modal Active Learning Approach for Audiovisual Event Detection
    Thiam, Patrick
    Meudt, Sascha
    Palm, Guenther
    Schwenker, Friedhelm
    [J]. NEURAL PROCESSING LETTERS, 2018, 48 (02) : 709 - 732