Image caption generation method based on adaptive attention mechanism

Cited by: 0
Authors
Jin, Huazhong [1]
Wu, Yu [1]
Wan, Fang [1]
Hu, Man [1]
Li, Qingqing [1]
Affiliations
[1] Hubei Univ Technol, Sch Comp, Wuhan, Hubei, Peoples R China
Keywords
Image caption; adaptive attention; image feature; deep neural network
DOI
10.1117/12.2539338
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
An image caption generation model with an adaptive attention mechanism is proposed to address a weakness of image description models that rely on local image features. Within an encoder-decoder framework, the encoder extracts local and global image features using the Inception V3 and VGG19 network models. Because the proposed adaptive attention mechanism automatically weighs the importance of local and global image information, the decoder can generate sentences that describe the image more intuitively and accurately. The model is trained and tested on the Microsoft COCO dataset. Experimental results show that, compared with an image caption model based on local features, the proposed method extracts richer and more complete information from the image and generates more accurate sentences.
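The abstract does not give the equations of the adaptive attention mechanism, so the following is only a minimal sketch of one way a decoder step could weigh local against global image information. It assumes soft attention over local region features scored by dot product with the decoder state, and a scalar sigmoid gate that blends the attended local context with a single global feature vector. The function name `adaptive_context`, the gate parameters, and the toy vectors are all hypothetical; in the paper the features come from Inception V3 and VGG19, which are not reproduced here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def adaptive_context(local_feats, global_feat, hidden, gate_weights, gate_bias=0.0):
    """Blend attended local features with a global feature (assumed formulation).

    local_feats : list of region vectors, one per image region
    global_feat : one vector summarising the whole image
    hidden      : current decoder state (same dimension as the features here)
    gate_weights, gate_bias : hypothetical gate parameters (learned in practice)
    """
    # Soft attention: score each region against the decoder state.
    alphas = softmax([dot(f, hidden) for f in local_feats])
    dim = len(global_feat)
    local_ctx = [sum(a * f[i] for a, f in zip(alphas, local_feats)) for i in range(dim)]
    # Scalar gate in (0, 1): how much weight the global feature receives.
    beta = 1.0 / (1.0 + math.exp(-(dot(gate_weights, hidden) + gate_bias)))
    context = [beta * g + (1.0 - beta) * l for g, l in zip(global_feat, local_ctx)]
    return context, alphas, beta


# Toy usage with two regions and 2-dimensional features.
local_feats = [[1.0, 0.0], [0.0, 1.0]]
global_feat = [0.5, 0.5]
hidden = [2.0, 0.0]
context, alphas, beta = adaptive_context(
    local_feats, global_feat, hidden, gate_weights=[0.0, 0.0]
)
```

In this toy run the decoder state points toward the first region, so that region receives the larger attention weight, while the zero gate weights leave local and global context equally mixed. A trained model would learn the gate parameters so that the balance shifts from word to word.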
Pages: 8
Related papers
50 in total
  • [1] Image Caption Description Generation Method Based on Reflective Attention Mechanism
    Qiao Pingan
    Yuan, Li
    Shen Ruixue
    [J]. ADVANCES IN NATURAL COMPUTATION, FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, ICNC-FSKD 2022, 2023, 153 : 600 - 609
  • [2] Image caption generation with dual attention mechanism
    Liu, Maofu
    Li, Lingjun
    Hu, Huijun
    Guan, Weili
    Tian, Jing
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2020, 57 (02)
  • [3] Image caption generation using a dual attention mechanism
    Padate, Roshni
    Jain, Amit
    Kalla, Mukesh
    Sharma, Arvind
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 123
  • [4] Image caption based on Visual Attention Mechanism
    Zhou, Jinfei
    Zhu, Yaping
    Pan, Hong
    [J]. PROCEEDINGS OF 2019 INTERNATIONAL CONFERENCE ON IMAGE, VIDEO AND SIGNAL PROCESSING (IVSP 2019), 2019 : 28 - 32
  • [5] Improved method for image caption with global attention mechanism
    Ma, Shulei
    Zhang, Guobin
    Jiao, Yang
    Shi, Guangming
    [J]. Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University, 2019, 46 (02) : 17 - 22
  • [6] Scene Attention Mechanism for Remote Sensing Image Caption Generation
    Wu, Shiqi
    Zhang, Xiangrong
    Wang, Xin
    Li, Chen
    Jiao, Licheng
    [J]. 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020
  • [7] Assamese news image caption generation using attention mechanism
    Das, Ringki
    Singh, Thoudam Doren
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (07) : 10051 - 10069
  • [8] Assamese news image caption generation using attention mechanism
    Ringki Das
    Thoudam Doren Singh
    [J]. Multimedia Tools and Applications, 2022, 81 : 10051 - 10069
  • [9] Bahdanau Attention Based Bengali Image Caption Generation
    Alam, Md Sahrial
    Rahman, Md Sayedur
    Hosen, Md Ikbal
    Mubin, Khairul Anam
    Hossen, Sharif
    Mridha, M. F.
    [J]. 2022 INTERNATIONAL CONFERENCE ON DECISION AID SCIENCES AND APPLICATIONS (DASA), 2022 : 1073 - 1077
  • [10] Research for image caption based on global attention mechanism
    Tong, Wu
    Tao, Ku
    Hao, Zhang
    [J]. SECOND TARGET RECOGNITION AND ARTIFICIAL INTELLIGENCE SUMMIT FORUM, 2020, 11427