Multimodal Attentive Representation Learning for Micro-video Multi-label Classification

Authors
Jing, Peiguang [1]
Liu, Xianyi [1]
Zhang, Lijuan [1]
Li, Yun [2]
Liu, Yu [1]
Su, Yuting [1]
Affiliations
[1] Tianjin Univ, Weijin Rd, Tianjin 300072, Peoples R China
[2] Guangxi Univ Finance & Econ, Nanning, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Micro-video; multimodal representations; multi-label; graph network; neural networks
DOI
10.1145/3643888
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
As a representative type of user-generated content (UGC) on social platforms, micro-videos have become popular in daily life. Although micro-videos naturally exhibit multimodal features that are rich enough to support representation learning, the complex correlations across modalities make valuable information difficult to integrate. In this paper, we introduce a multimodal attentive representation network (MARNET) that learns complete and robust representations for micro-video multi-label classification. To address the common problem of missing modalities, we present a multimodal information aggregation module, in which latent common representations are obtained by modeling complementarity and consistency over visual-centered modality groupings rather than single modalities. To address label correlation, we design an attentive graph neural network module that adaptively learns the label correlation matrix and label representations for better compatibility with the training data. In addition, a cross-modal multi-head attention module makes the learned common representations label-aware for multi-label classification. Experiments on two micro-video datasets show that MARNET outperforms state-of-the-art methods.
Pages: 23