Supervised cross-modal factor analysis for multiple modal data classification

Cited by: 13
Authors
Wang, Jingbin [1, 2]
Zhou, Yihua [3]
Duan, Kanghong [4]
Wang, Jim Jing-Yan [5]
Bensmail, Halima [6]
Affiliations
[1] Chinese Acad Sci, Natl Time Serv Ctr, Xian 710600, Peoples R China
[2] Chinese Acad Sci, Grad Univ, Beijing 100039, Peoples R China
[3] Lehigh Univ, Dept Mech Engn & Mech, Bethlehem, PA 18015 USA
[4] State Ocean Adm, North China Sea Marine Tech Support Ctr, Qingdao 266033, Peoples R China
[5] King Abdullah Univ Sci & Technol, Comp Elect & Math Sci & Engn Div, Thuwal 23955, Saudi Arabia
[6] Qatar Comp Res Inst, Doha 5825, Qatar
Keywords
Multiple modal learning; Cross-modal factor analysis; Supervised learning; SPARSE REPRESENTATION; TEXT CLASSIFICATION; SURFACE; ACTIVATION; NETWORK;
DOI
10.1109/SMC.2015.329
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we study the problem of learning from multiple modal data for the purpose of document classification. In this problem, each document is composed of two different modalities of data, i.e., an image and a text. Cross-modal factor analysis (CFA) has been proposed to project the two modalities into a shared data space, so that an image or a text can be classified directly in this space. A disadvantage of CFA is that it ignores the supervision information. In this paper, we improve CFA by incorporating the supervision information to represent and classify both the image and text modalities of documents. We project both image and text data into a shared data space by factor analysis, and then train a class label predictor in the shared space to exploit the class label information. The factor analysis parameter and the predictor parameter are learned jointly by solving a single objective function. With this objective function, we minimize both the distance between the projections of the image and text of the same document and the classification error of the projections, measured by the hinge loss function. The objective function is optimized by an alternating optimization strategy in an iterative algorithm. Experiments on two different multiple modal document data sets show the advantage of the proposed algorithm over other CFA methods.
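The abstract does not give the objective function explicitly. As a rough illustration only, the sketch below evaluates one plausible form of such a joint objective: the squared distance between the image and text projections of each document, plus a hinge loss on a linear predictor applied to both projections. The function name, the linear projections `U` and `V`, the predictor `w`, the trade-off weight `C`, and the restriction to binary labels in {-1, +1} are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def supervised_cfa_objective(X_img, X_txt, y, U, V, w, C=1.0):
    """Evaluate a hypothetical joint objective (assumed form, not the paper's).

    X_img: (n, d_img) image features, X_txt: (n, d_txt) text features,
    y: (n,) labels in {-1, +1}, U: (d_img, k) and V: (d_txt, k) projections,
    w: (k,) linear predictor in the shared space, C: trade-off weight.
    """
    # Project both modalities into the shared k-dimensional space.
    Z_img = X_img @ U
    Z_txt = X_txt @ V

    # Distance between the two projections of the same document.
    distance = np.sum((Z_img - Z_txt) ** 2)

    # Hinge loss of the shared-space predictor on each projection.
    hinge_img = np.maximum(0.0, 1.0 - y * (Z_img @ w))
    hinge_txt = np.maximum(0.0, 1.0 - y * (Z_txt @ w))

    return distance + C * (hinge_img.sum() + hinge_txt.sum())
```

Under these assumptions, an alternating scheme of the kind the abstract mentions would update `U` and `V` with `w` fixed, then update `w` with the projections fixed, repeating until this value stops decreasing.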
Pages: 1882-1888
Number of pages: 7
Related papers
50 in total
  • [1] Joint learning of cross-modal classifier and factor analysis for multimedia data classification
    Duan, Kanghong
    Zhang, Hongxin
    Wang, Jim Jing-Yan
    NEURAL COMPUTING & APPLICATIONS, 2016, 27 (02): : 459 - 468
  • [2] Joint learning of cross-modal classifier and factor analysis for multimedia data classification
    Kanghong Duan
    Hongxin Zhang
    Jim Jing-Yan Wang
    Neural Computing and Applications, 2016, 27 : 459 - 468
  • [3] A semi-supervised cross-modal memory bank for cross-modal retrieval
    Huang, Yingying
    Hu, Bingliang
    Zhang, Yipeng
    Gao, Chi
    Wang, Quan
    NEUROCOMPUTING, 2024, 579
  • [4] Deep Supervised Cross-modal Retrieval
    Zhen, Liangli
    Hu, Peng
    Wang, Xu
    Peng, Dezhong
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 10386 - 10395
  • [5] Supervised Hierarchical Cross-Modal Hashing
    Sun, Changchang
    Song, Xuemeng
    Feng, Fuli
    Zhao, Wayne Xin
    Zhang, Hao
    Nie, Liqiang
    PROCEEDINGS OF THE 42ND INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '19), 2019, : 725 - 734
  • [6] Weakly Supervised Cross-Modal Hashing
    Liu, Xuanwu
    Yu, Guoxian
    Domeniconi, Carlotta
    Wang, Jun
    Xiao, Guoqiang
    Guo, Maozu
    IEEE TRANSACTIONS ON BIG DATA, 2022, 8 (02) : 552 - 563
  • [7] Semi-supervised classification-aware cross-modal deep adversarial data augmentation
    Wang, Shaoqiang
    Wu, Zhenzhen
    He, Gewen
    Wang, Shudong
    Sun, Hongwei
    Fan, Fangfang
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2021, 125 : 194 - 205
  • [8] SUPERVISED CROSS-MODAL HASHING WITHOUT RELAXATION
    Huang, Hua-Junjie
    Yang, Rui
    Li, Chuan-Xiang
    Shi, Yuliang
    Guo, Shanqing
    Xu, Xin-Shun
    2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2017, : 1159 - 1164
  • [9] Cross-Modal Retrieval Augmentation for Multi-Modal Classification
    Gur, Shir
    Neverova, Natalia
    Stauffer, Chris
    Lim, Ser-Nam
    Kiela, Douwe
    Reiter, Austin
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 111 - 123
  • [10] Federated learning for supervised cross-modal retrieval
    Li, Ang
    Li, Yawen
    Shao, Yingxia
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2024, 27 (04):