Automated Generation of Chinese Text-Image Summaries Using Deep Learning Techniques

Cited by: 0
Authors
Xu, Meiling [1, 2]
Abd Rahman, Hayati [1]
Li, Feng [1, 2]
Affiliations
[1] Univ Teknol MARA, Coll Comp Informat & Math, Shah Alam 40450, Malaysia
[2] Hebei Finance Univ, Coll Comp & Informat Engn, Baoding 071051, Peoples R China
Keywords
Chinese text-image summaries; automated summary generation; deep learning; MaliGAN; cross-modal similarity retrieval; adaptive fusion strategy
DOI
10.18280/ts.400644
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In the internet era, Chinese text-image content is produced continuously and in large volumes, which calls for effective automated technologies to process and summarize it. Automatically generating Chinese text-image summaries lets readers grasp key information quickly and so makes information consumption more efficient. Because of the distinctive characteristics of the Chinese language, traditional automatic summarization techniques transfer poorly, which motivates text-image summary generation technologies tailored to Chinese. Existing natural language processing and deep learning techniques have advanced text summarization, but they remain weak at deep semantic mining and at integrating text and image content. This study focuses on two approaches. The first is a generative approach based on an enhanced MaliGAN model, which uses deep learning to improve the quality of generated text. The second is a retrieval-based approach, which uses cross-modal similarity retrieval to extract the text most relevant to the image content and uses it to guide summary generation. The study also proposes a model architecture comprising segmentation, cross-modal retrieval, and adaptive fusion strategy modules, significantly improving the accuracy and reliability of text-image summary generation.
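The retrieval-based approach in the abstract can be illustrated with a minimal sketch. The abstract does not specify the encoders or the similarity measure, so everything here is an assumption: the image and each candidate sentence are taken to be already embedded in a shared vector space, cosine similarity stands in for the cross-modal similarity, and the names `cosine_similarity` and `retrieve_top_k` are hypothetical helpers, not the paper's implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve_top_k(image_vec, sentence_vecs, k=2):
    """Return (sentence_index, score) pairs for the k sentences most similar to the image."""
    scored = [(i, cosine_similarity(image_vec, s)) for i, s in enumerate(sentence_vecs)]
    # Highest similarity first; ties keep their original sentence order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings (assumed, for illustration only).
image_vec = [1.0, 0.0, 1.0]
sentence_vecs = [
    [1.0, 0.0, 1.0],  # same direction as the image embedding
    [0.0, 1.0, 0.0],  # orthogonal to the image embedding
    [1.0, 1.0, 1.0],  # partially aligned with the image embedding
]
top = retrieve_top_k(image_vec, sentence_vecs, k=2)
top_indices = [i for i, _ in top]
```

In this toy case the first and third sentences rank highest, and in a pipeline like the one the abstract describes, such retrieved sentences would then guide summary generation.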
Pages: 2835-2843
Page count: 9