Collaborative fine-grained interaction learning for image-text sentiment analysis

Cited by: 3
Authors
Xiao, Xingwang [1]
Pu, Yuanyuan [1, 2]
Zhou, Dongming [1]
Cao, Jinde [3]
Gu, Jinjing [1]
Zhao, Zhengpeng [1]
Xu, Dan [1]
Affiliations
[1] Yunnan Univ, Coll Informat Sci & Engn, Kunming 650500, Peoples R China
[2] Univ Key Lab Internet Things Technol & Applicat, Kunming 650500, Yunnan, Peoples R China
[3] Southeast Univ, Coll Automat, Nanjing 210096, Peoples R China
Keywords
Image-text sentiment analysis; Fine-grained interaction; Image-text dataset; Memory transformer
DOI
10.1016/j.knosys.2023.110951
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Investigating interactions between image and text can effectively improve image-text sentiment analysis, but most existing methods do not explore image-text interaction at a fine-grained level. In this paper, we propose a Memory-enhanced Collaborative Fine-grained Interaction Transformer (MCFIT) to learn collaborative fine-grained interaction between image and text. Specifically, a multi-branch encoder is designed to learn both fine-grained region-word and patch-word interactions. Meanwhile, Memory-enhanced Cross-Attention (MECA) is proposed to utilize patch and region information to improve region-word interaction and patch-word interaction learning, respectively. Therefore, collaborative fine-grained interaction can yield more accurate image-text interaction. Finally, to analyze the sentiments embedded in real-life Chinese image-text pairs, we build a large-scale Chinese image-text sentiment dataset (CISD) containing 54,931 image-text pairs. Extensive experiments conducted on four real-life datasets prove the effectiveness of collaborative fine-grained interaction and demonstrate that MCFIT outperforms the state-of-the-art baselines. (c) 2023 Elsevier B.V. All rights reserved.
Pages: 11
Related papers
50 in total
  • [21] Fine-grained Sentiment Analysis Based on Sentiment Disambiguation
    Cai, Xiao-hong
    Liu, Pei-yu
    Wang, Zhi-hao
    Zhu, Zhen-fang
    2016 8TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY IN MEDICINE AND EDUCATION (ITME), 2016, : 557 - 561
  • [22] Multi-level network based on transformer encoder for fine-grained image-text matching
    Yang, Lei
    Feng, Yong
    Zhou, Mingliang
    Xiong, Xiancai
    Wang, Yongheng
    Qiang, Baohua
    MULTIMEDIA SYSTEMS, 2023, 29 (04) : 1981 - 1994
  • [23] VSR++: Improving Visual Semantic Reasoning for Fine-Grained Image-Text Matching
    Yuan, Hui
    Huang, Yan
    Zhang, Dongbo
    Chen, Zerui
    Cheng, Wenlong
    Wang, Liang
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 3728 - 3735
  • [24] Towards Fast and Accurate Image-Text Retrieval with Self-Supervised Fine-Grained Alignment
    Zhuang, Jiamin
    Yu, Jing
    Ding, Yang
    Qu, Xiangyan
    Hu, Yue
    arXiv, 2023,
  • [25] Memorize, Associate and Match: Embedding Enhancement via Fine-Grained Alignment for Image-Text Retrieval
    Li, Jiangtong
    Liu, Liu
    Niu, Li
    Zhang, Liqing
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 (30) : 9193 - 9207
  • [26] A Fine-Grained Sentiment Analysis Method Using Transformer for Weibo Comment Text
    Xue, Piao
    Bai, Wei
    INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGIES AND SYSTEMS APPROACH, 2024, 17 (01)
  • [27] Fine-Grained Image Analysis With Deep Learning: A Survey
    Wei, Xiu-Shen
    Song, Yi-Zhe
    Mac Aodha, Oisin
    Wu, Jianxin
    Peng, Yuxin
    Tang, Jinhui
    Yang, Jian
    Belongie, Serge
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (12) : 8927 - 8948
  • [28] Mutual Disentanglement Learning for Joint Fine-Grained Sentiment Classification and Controllable Text Generation
    Fei, Hao
    Li, Chenliang
    Ji, Donghong
    Li, Fei
    PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22), 2022, : 1555 - 1565
  • [29] Fine-Grained Text Sentiment Transfer via Dependency Parsing
    Xiao, Lulu
    Qu, Xiaoye
    Li, Ruixuan
    Wang, Jun
    Zhou, Pan
    Li, Yuhua
    ECAI 2020: 24TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, 325 : 2228 - 2235
  • [30] Mining Fine-Grained Image-Text Alignment for Zero-Shot Captioning via Text-Only Training
    Qiu, Longtian
    Ning, Shan
    He, Xuming
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 5, 2024, : 4605 - 4613