Abstractive Text-Image Summarization Using Multi-Modal Attentional Hierarchical RNN

Authors
Chen, Jingqiang [1]
Zhuge, Hai [1, 2, 3, 4]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Nanjing, Peoples R China
[2] Aston Univ, Birmingham, W Midlands, England
[3] Guangzhou Univ, Guangzhou, Peoples R China
[4] Chinese Acad Sci, Univ Chinese Acad Sci, Key Lab Intelligent Informat Proc, ICT, Beijing, Peoples R China
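Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract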
The rapid growth of multi-modal documents on the Internet makes multi-modal summarization research necessary. Most previous research summarizes texts or images separately. Recent neural summarization research shows the strength of the Encoder-Decoder model in text summarization. This paper proposes an abstractive text-image summarization model that uses an attentional hierarchical Encoder-Decoder model to summarize a text document and its accompanying images simultaneously, and then to align the sentences and images in the summaries. A multi-modal attentional mechanism is proposed to attend to the original sentences, images, and captions when decoding. The DailyMail dataset is extended by collecting images and captions from the Web. Experiments show that our model outperforms neural abstractive and extractive text summarization methods that do not consider images. In addition, our model can generate informative summaries of images.
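The abstract does not give the equations of the multi-modal attentional mechanism, so the following is only a minimal illustrative sketch of the general idea of attending to sentences, images, and captions from one decoder state. Everything in it is an assumption made for illustration: the function names (`attend`, `multimodal_context`), the use of plain dot-product scoring, the shared vector dimension across modalities, and the fusion by simple averaging. It should not be read as the authors' formulation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attend(query, keys):
    """Dot-product attention (an assumed scoring function).

    Returns the attention weights over `keys` and the weighted-sum context.
    """
    weights = softmax([dot(query, k) for k in keys])
    dim = len(keys[0])
    context = [sum(w * k[i] for w, k in zip(weights, keys)) for i in range(dim)]
    return weights, context

def multimodal_context(query, sentences, images, captions):
    """Attend to each modality separately, then average the three contexts.

    Averaging is a placeholder fusion rule chosen for this sketch only.
    All vectors are assumed to share the dimension of `query`.
    """
    all_weights = {}
    contexts = []
    for name, keys in (("sentences", sentences),
                       ("images", images),
                       ("captions", captions)):
        weights, context = attend(query, keys)
        all_weights[name] = weights
        contexts.append(context)
    fused = [sum(c[i] for c in contexts) / len(contexts)
             for i in range(len(query))]
    return all_weights, fused

# Toy vectors standing in for the decoder state and encoded inputs.
query = [1.0, 0.0]
sentences = [[1.0, 0.0], [0.0, 1.0]]
images = [[0.5, 0.5]]
captions = [[0.0, 1.0], [2.0, 0.0]]
weights, fused = multimodal_context(query, sentences, images, captions)
```

In this toy run the sentence and caption vectors that point in the same direction as the query receive the larger weights, and the per-modality weights can be inspected separately, which is the kind of signal a model could use to align summary sentences with images.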
Pages: 4046-4056
Page count: 11