Supervised cross-modal factor analysis for multiple modal data classification

Cited by: 13
Authors
Wang, Jingbin [1 ,2 ]
Zhou, Yihua [3 ]
Duan, Kanghong [4 ]
Wang, Jim Jing-Yan [5 ]
Bensmail, Halima [6 ]
Affiliations
[1] Chinese Acad Sci, Natl Time Serv Ctr, Xian 710600, Peoples R China
[2] Chinese Acad Sci, Grad Univ, Beijing 100039, Peoples R China
[3] Lehigh Univ, Dept Mech Engn & Mech, Bethlehem, PA 18015 USA
[4] State Ocean Adm, North China Sea Marine Tech Support Ctr, Qingdao 266033, Peoples R China
[5] King Abdullah Univ Sci & Technol, Comp Elect & Math Sci & Engn Div, Thuwal 23955, Saudi Arabia
[6] Qatar Comp Res Inst, Doha 5825, Qatar
Keywords
Multiple modal learning; Cross-modal factor analysis; Supervised learning; SPARSE REPRESENTATION; TEXT CLASSIFICATION; SURFACE; ACTIVATION; NETWORK;
DOI
10.1109/SMC.2015.329
Chinese Library Classification
TP3 [Computing Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
In this paper we study the problem of learning from multiple modal data for the purpose of document classification. In this problem, each document is composed of two different modalities of data, i.e., an image and a text. Cross-modal factor analysis (CFA) has been proposed to project the two modalities into a shared data space, so that the classification of an image or a text can be performed directly in this space. A disadvantage of CFA is that it ignores the supervision information. In this paper, we improve CFA by incorporating the supervision information to represent and classify both the image and text modalities of documents. We project both the image and text data into a shared data space by factor analysis, and then train a class label predictor in the shared space to exploit the class label information. The factor analysis parameters and the predictor parameters are learned jointly by solving a single objective function. With this objective function, we minimize both the distance between the projections of the image and text of the same document and the classification error of the projection, measured by the hinge loss function. The objective function is optimized by an alternating optimization strategy in an iterative algorithm. Experiments on two different multiple modal document data sets show the advantage of the proposed algorithm over other CFA methods.
Pages: 1882-1888
Number of pages: 7
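The method summarized in the abstract lends itself to a short illustration: both modalities are projected into a shared space, the two projections of each document are pulled together, and a hinge-loss class-label predictor is trained jointly in that space by alternating optimization. The sketch below is a minimal reading of that objective; the variable names, the synthetic data, the choice to classify the image projection, and the plain (sub)gradient alternating updates are assumptions for illustration, not the paper's exact algorithm.

```python
# Minimal sketch of a supervised cross-modal factor analysis style objective:
# project image and text features of each document into a shared space,
# pull the paired projections together, and train a hinge-loss predictor there.
# All names, the synthetic data, and the (sub)gradient alternating scheme are
# illustrative assumptions, not the published formulation.
import numpy as np

rng = np.random.default_rng(0)
n, d_img, d_txt, k = 200, 50, 40, 10          # documents, feature dims, shared dim
X_img = rng.normal(size=(n, d_img))           # image-modality features
X_txt = rng.normal(size=(n, d_txt))           # text-modality features
y = rng.choice([-1.0, 1.0], size=n)           # binary class labels

U = rng.normal(scale=0.1, size=(d_img, k))    # image projection matrix
V = rng.normal(scale=0.1, size=(d_txt, k))    # text projection matrix
w = np.zeros(k)                               # linear predictor in the shared space
alpha, lam, lr = 1.0, 0.1, 1e-3               # loss trade-off, ridge weight, step size

def objective(U, V, w):
    Zi, Zt = X_img @ U, X_txt @ V
    coupling = np.sum((Zi - Zt) ** 2)                   # distance between paired projections
    hinge = np.maximum(0.0, 1.0 - y * (Zi @ w)).sum()   # hinge loss on the image projection
    ridge = np.sum(U ** 2) + np.sum(V ** 2) + w @ w
    return coupling + alpha * hinge + lam * ridge

for it in range(300):
    # Step 1: update the projections U, V with the predictor w fixed.
    Zi, Zt = X_img @ U, X_txt @ V
    active = ((y * (Zi @ w)) < 1.0).astype(float)       # margin-violating documents
    gU = (2 * X_img.T @ (Zi - Zt)
          - alpha * np.outer(X_img.T @ (active * y), w)
          + 2 * lam * U)
    gV = -2 * X_txt.T @ (Zi - Zt) + 2 * lam * V
    U -= lr * gU
    V -= lr * gV

    # Step 2: update the predictor w with the projections fixed.
    Zi = X_img @ U
    active = ((y * (Zi @ w)) < 1.0).astype(float)
    gw = -alpha * Zi.T @ (active * y) + 2 * lam * w
    w -= lr * gw

print("final objective value:", objective(U, V, w))
```

Each outer iteration alternates between the two sub-problems the abstract describes: refitting the projections while the predictor is held fixed, then refitting the predictor in the shared space while the projections are held fixed.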