Abstractive Text-Image Summarization Using Multi-Modal Attentional Hierarchical RNN

Cited by: 0
Authors
Chen, Jingqiang [1]
Zhuge, Hai [1, 2, 3, 4]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Nanjing, Peoples R China
[2] Aston Univ, Birmingham, W Midlands, England
[3] Guangzhou Univ, Guangzhou, Peoples R China
[4] Chinese Acad Sci, Univ Chinese Acad Sci, Key Lab Intelligent Informat Proc, ICT, Beijing, Peoples R China
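Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract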
The rapid growth of multi-modal documents on the Internet makes multi-modal summarization research necessary. Most previous research summarizes texts or images separately. Recent neural summarization research shows the strength of the Encoder-Decoder model in text summarization. This paper proposes an abstractive text-image summarization model that uses an attentional hierarchical Encoder-Decoder model to summarize a text document and its accompanying images simultaneously, and then aligns the sentences and images in the summaries. A multi-modal attentional mechanism is proposed to attend to the original sentences, images, and captions when decoding. The DailyMail dataset is extended by collecting images and captions from the Web. Experiments show that our model outperforms neural abstractive and extractive text summarization methods that do not consider images. In addition, our model can generate informative summaries of images.
Pages: 4046-4056
Page count: 11