MULTI-VIEW FUSION THROUGH CROSS-MODAL RETRIEVAL

Citations: 0
Authors
Cui, Limeng [1]
Chen, Zhensong [2]
Zhang, Jiawei [3]
He, Lifang [4]
Shi, Yong [2]
Yu, Philip S. [5]
Affiliations
[1] Univ Chinese Acad Sci, Sch Comp & Control Engn, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Econ & Management, Beijing, Peoples R China
[3] Florida State Univ, Dept Comp Sci, IFM Lab, Tallahassee, FL 32306 USA
[4] Cornell Univ, Weill Cornell Med, Ithaca, NY 14853 USA
[5] Univ Illinois, Dept Comp Sci, Chicago, IL 60680 USA
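Funding
National Science Foundation (USA); National Natural Science Foundation of China
Keywords
Cross-modal retrieval; tensor modeling; multi-view learning
DOI
Not available
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract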
Cross-modal retrieval, which uses text queries to retrieve relevant images or vice versa, has drawn much attention in recent years. The task exhibits dual heterogeneity: the modalities themselves differ, and the features obtained from multiple views are heterogeneous. To address this issue, we propose CMTM, a multi-view fusion method for cross-modal retrieval based on tensor modeling, which exploits the full-order feature interactions within the multimodal data. To facilitate the integration of heterogeneous features from multiple views, we adopt a tensor structure that models the full-order interactions among the multi-view features. A tensor factorization is then applied to derive the model parameters. Extensive experiments demonstrate the effectiveness of CMTM on cross-modal retrieval.
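The idea of modeling full-order interactions with a tensor and then factorizing the parameters can be illustrated with a small sketch. This is not the CMTM implementation: the abstract does not specify the factorization, so a CP-style (rank-R) factorization is assumed here, and the function names, feature dimensions, and rank are all hypothetical. Appending a constant 1 to each view's feature vector and taking the outer product gives a tensor whose entries cover every order of interaction among the views; with a factorized weight tensor, the score can be computed without ever building that tensor.

```python
import numpy as np

def full_order_tensor(views):
    """Outer product of all views, each padded with a constant 1.

    The padding makes the tensor contain first-order terms, pairwise
    products, and so on up to the product across all views.
    """
    padded = [np.append(v, 1.0) for v in views]
    tensor = padded[0]
    for p in padded[1:]:
        tensor = np.multiply.outer(tensor, p)
    return tensor

def cp_to_full(factors):
    """Rebuild a dense weight tensor from assumed CP factors.

    Each factor has shape (dim_v + 1, rank); the tensor is the sum over
    rank components of the outer product of the matching columns.
    """
    rank = factors[0].shape[1]
    full = 0.0
    for r in range(rank):
        comp = factors[0][:, r]
        for f in factors[1:]:
            comp = np.multiply.outer(comp, f[:, r])
        full = full + comp
    return full

def score_factorized(views, factors):
    """Inner product <W, X> computed directly from the CP factors."""
    padded = [np.append(v, 1.0) for v in views]
    # Project each padded view onto its factor matrix: one (rank,) vector per view.
    projections = [p @ f for p, f in zip(padded, factors)]
    # Multiply across views, then sum over rank components.
    return float(np.prod(projections, axis=0).sum())

# Hypothetical data: three views with 3, 4, and 2 features, rank 5.
rng = np.random.default_rng(0)
views = [rng.normal(size=3), rng.normal(size=4), rng.normal(size=2)]
rank = 5
factors = [rng.normal(size=(v.size + 1, rank)) for v in views]

dense = float(np.sum(cp_to_full(factors) * full_order_tensor(views)))
fast = score_factorized(views, factors)
```

Under these assumptions `dense` and `fast` agree up to floating-point error, while the factorized form needs parameters that grow with the sum of the view dimensions rather than their product.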
Pages: 1977-1981
Page count: 5