LEARNING UNIFIED SPARSE REPRESENTATIONS FOR MULTI-MODAL DATA

Cited by: 0
Authors
Wang, Kaiye [1 ]
Wang, Wei [1 ]
Wang, Liang [1 ]
Affiliations
[1] Chinese Acad Sci, Natl Lab Pattern Recognit, Ctr Res Intelligent Percept & Comp, Inst Automat, Beijing 100190, Peoples R China
Keywords
Cross-modal retrieval; unified representation learning; joint dictionary learning; multi-modal data;
DOI
None available
CLC Classification Number
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Classification Code
0808; 0809;
Abstract
Cross-modal retrieval has recently become an interesting and important research problem, in which users take data of one modality (e.g., text, image, or video) as the query to retrieve relevant data of another modality. In this paper, we present a Multi-modal Unified Representation Learning (MURL) algorithm for cross-modal retrieval, which learns unified sparse representations for multi-modal data representing the same semantics via joint dictionary learning. The l(1)-norm is imposed on the unified representations to explicitly encourage sparsity, which makes our algorithm more robust. Furthermore, a constraint regularization term forces the representations to be similar if their corresponding multi-modal data have must-links, or far apart if they have cannot-links. An iterative algorithm is also proposed to solve the objective function. The effectiveness of the proposed method is verified by extensive results on two real-world datasets.
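The abstract describes learning one sparse code shared by all modalities of an item. As a rough illustration of that core idea (not the paper's MURL algorithm, which also learns the dictionaries jointly and adds must-link/cannot-link regularization), the sketch below infers a single l1-sparse code for a pair of fixed, hypothetical per-modality dictionaries using ISTA; all names, dimensions, and parameter values are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (elementwise shrinkage toward zero)
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def unified_sparse_code(x1, x2, D1, D2, lam=0.05, n_iter=500):
    """Infer one sparse code `a` shared by both modalities via ISTA:
        min_a  0.5*||x1 - D1 a||^2 + 0.5*||x2 - D2 a||^2 + lam*||a||_1
    D1, D2 are fixed per-modality dictionaries (assumed given here;
    the paper learns them jointly).
    """
    G = D1.T @ D1 + D2.T @ D2        # combined Gram matrix of both modalities
    b = D1.T @ x1 + D2.T @ x2
    L = np.linalg.eigvalsh(G)[-1]    # Lipschitz constant of the smooth part
    a = np.zeros(D1.shape[1])
    for _ in range(n_iter):
        # gradient step on the reconstruction terms, then l1 shrinkage
        a = soft_threshold(a - (G @ a - b) / L, lam / L)
    return a

# Toy example: two "modalities" generated from the same 2-sparse code.
rng = np.random.default_rng(0)
D1 = rng.standard_normal((40, 30))   # e.g., image-feature dictionary
D2 = rng.standard_normal((30, 30))   # e.g., text-feature dictionary
a_true = np.zeros(30)
a_true[3], a_true[7] = 1.5, -2.0
a_hat = unified_sparse_code(D1 @ a_true, D2 @ a_true, D1, D2)
```

Because both reconstruction terms share the same code `a`, either modality alone can be mapped into the common sparse space at query time, which is what enables cross-modal matching.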
Pages: 3545 - 3549
Number of pages: 5
Related Papers
50 items in total
  • [31] Multi-Modal Sparse Tracking by Jointing Timing and Modal Consistency
    Li, Jiajun
    Fang, Bin
    Zhou, Mingliang
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2022, 36 (06)
  • [32] Multi-modal Relation Distillation for Unified 3D Representation Learning
    Wang, Huiqun
    Bao, Yiping
    Pan, Panwang
    Li, Zeming
    Liu, Xiao
    Yang, Ruijie
    Huang, Di
    COMPUTER VISION - ECCV 2024, PT XXXIII, 2025, 15091 : 364 - 381
  • [33] Unsupervised Multi-modal Learning
    Iqbal, Mohammed Shameer
    ADVANCES IN ARTIFICIAL INTELLIGENCE (AI 2015), 2015, 9091 : 343 - 346
  • [34] Learning Multi-modal Similarity
    McFee, Brian
    Lanckriet, Gert
    JOURNAL OF MACHINE LEARNING RESEARCH, 2011, 12 : 491 - 523
  • [35] Learning Multi-Modal Biomarker Representations via Globally Aligned Longitudinal Enrichments
    Lu, Lyujian
    Elbeleidy, Saad
    Baker, Lauren Zoe
    Wang, Hua
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 817 - 824
  • [36] Unified Multi-Modal Data Aggregation for Complementary Sensor Networks Applied for Localization
    Berndt, Maximilian
    Krummacker, Dennis
    Fischer, Christoph
    Schotten, Hans D.
    2022 IEEE 95TH VEHICULAR TECHNOLOGY CONFERENCE (VTC2022-SPRING), 2022,
  • [37] Heterogeneous Graph Learning for Multi-Modal Medical Data Analysis
    Kim, Sein
    Lee, Namkyeong
    Lee, Junseok
    Hyun, Dongmin
    Park, Chanyoung
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 4, 2023, : 5141 - 5150
  • [38] SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection
    Xie, Yichen
    Xu, Chenfeng
    Rakotosaona, Marie-Julie
    Rim, Patrick
    Tombari, Federico
    Keutzer, Kurt
    Tomizuka, Masayoshi
    Zhan, Wei
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 17545 - 17556
  • [39] Understanding Fun in Learning to Code: A Multi-Modal Data approach
    Tisza, Gabriella
    Sharma, Kshitij
    Papavlasopoulou, Sofia
    Markopoulos, Panos
    Giannakos, Michail
    PROCEEDINGS OF THE 2022 ACM INTERACTION DESIGN AND CHILDREN, IDC 2022, 2022, : 274 - 287
  • [40] Towards a systematic multi-modal representation learning for network data
    Ben Houidi, Zied
    Azorin, Raphael
    Gallo, Massimo
    Finamore, Alessandro
    Rossi, Dario
    THE 21ST ACM WORKSHOP ON HOT TOPICS IN NETWORKS, HOTNETS 2022, 2022, : 181 - 187