Multi-channel weighted fusion for image captioning

Authors
Jingyue Zhong
Yang Cao
Yina Zhu
Jie Gong
Qiaosen Chen
Affiliation
[1] South China Normal University, School of Computer Science
Source
The Visual Computer | 2023 / Vol. 39
Keywords
Image captioning; Multi-channel encoder; Weighted fusion; Dimension reduction
DOI
Not available
Abstract
Automatically describing the details and content of an image is a meaningful but difficult task. In this paper, we propose a set of optimizations to the encoder and decoder for image captioning, collectively called multi-channel weighted fusion. The model uses a multi-channel encoder that extracts different features of the same image by combining various models and algorithms. To avoid the dimensional explosion caused by the multi-channel encoder, we employ a reducing multilayer perceptron to reduce the feature dimension and discuss how to train it. For the decoder, we discuss how it receives features from different channels and propose a technique for fusing independent, identically typed decoders. To obtain better descriptions, we apply a voting weight strategy for decoder fusion and explore an entropy function to choose the best distribution. Experiments on the Flickr-8k, Flickr-30k, and MS COCO datasets demonstrate that the proposed model is compatible with most features and has a low error rate. In particular, the model performs especially well on the METEOR score.
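The abstract names two decoder-side ideas, a voting weight strategy for fusion and an entropy function for choosing a distribution, without giving their formulas. The following is only a minimal sketch of what such steps could look like, not the authors' method. It assumes each decoder outputs a next-word probability distribution over the same vocabulary, that fusion is a normalized weighted average, and that the "best" distribution is the one with the lowest Shannon entropy. The names `weighted_fusion`, `lowest_entropy`, `channel_dists`, and `voting_weights` are hypothetical.

```python
import math
from typing import List, Sequence


def entropy(dist: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0.0)


def weighted_fusion(dists: Sequence[Sequence[float]],
                    weights: Sequence[float]) -> List[float]:
    """Weighted average of per-decoder distributions.

    Assumption: weights are non-negative voting weights; they are
    normalized here so the fused result still sums to 1.
    """
    total = sum(weights)
    vocab_size = len(dists[0])
    return [
        sum(w * d[i] for w, d in zip(weights, dists)) / total
        for i in range(vocab_size)
    ]


def lowest_entropy(dists: Sequence[Sequence[float]]) -> Sequence[float]:
    """Pick the most confident (lowest-entropy) distribution.

    Assumption: lower entropy is treated as 'better'; the paper's actual
    selection criterion may differ.
    """
    return min(dists, key=entropy)


# Toy example: three decoders (one per channel), vocabulary of three words.
channel_dists = [
    [0.70, 0.20, 0.10],
    [0.40, 0.35, 0.25],
    [0.10, 0.80, 0.10],
]
voting_weights = [2.0, 1.0, 1.0]  # hypothetical weights

fused = weighted_fusion(channel_dists, voting_weights)
best = lowest_entropy(channel_dists)
print("fused distribution:", fused)
print("lowest-entropy distribution:", best)
```

In this toy setup the two strategies can disagree: the weighted average favors the word backed by the heavily weighted first decoder, while the entropy criterion selects the third decoder because it is the most peaked. How the paper combines or trades off the two is not stated in the abstract.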
Pages: 6115 - 6132
Page count: 17