Multi-channel weighted fusion for image captioning

Cited by: 0
Authors
Jingyue Zhong
Yang Cao
Yina Zhu
Jie Gong
Qiaosen Chen
Affiliation
[1] South China Normal University, School of Computer Science
Source
The Visual Computer | 2023 / Vol. 39, No. 12
Keywords
Image captioning; Multi-channel encoder; Weighted fusion; Dimension reduction
DOI
Not available
Abstract
Automatically describing the details and content of an image is a meaningful but difficult task. In this paper, we propose a set of optimizations to the encoder and decoder for image captioning, collectively called multi-channel weighted fusion. In the presented model, we propose a multi-channel encoder that extracts different features of the same image by combining various models and algorithms. To avoid the dimensional explosion caused by the multi-channel encoder, we employ a reducing multilayer perceptron to reduce the dimension and discuss how to train it. For the decoder, we discuss how it receives features from different channels and propose a technique for fusing independent, identically typed decoders. To obtain better descriptions from the decoder, we use a voting weight strategy for decoder fusion and explore an entropy function to choose the best distribution. Experiments on the Flickr-8k, Flickr-30k and MS COCO datasets demonstrate that the proposed model is compatible with most features and has a low error rate. In particular, our model performs especially well on the METEOR score.
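
The abstract names two ideas for the decoder side: a voting weight strategy for fusing decoders and an entropy function for choosing the best distribution. It does not give the formulas, so the following is only a minimal sketch under stated assumptions: each decoder is assumed to output a next-word probability distribution over the same vocabulary, fusion is assumed to be a weighted average, and "best" is assumed to mean lowest Shannon entropy. The function names, the example distributions, and the weights are all hypothetical and are not taken from the paper.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def weighted_fusion(distributions, weights):
    """Weighted average of per-decoder distributions (assumed fusion rule).

    Dividing by the sum of the weights keeps the result a valid
    distribution even if the weights do not sum to one.
    """
    total = sum(weights)
    vocab_size = len(distributions[0])
    return [
        sum(w * d[i] for w, d in zip(weights, distributions)) / total
        for i in range(vocab_size)
    ]


def lowest_entropy_index(distributions):
    """Index of the most confident (lowest-entropy) distribution."""
    return min(range(len(distributions)), key=lambda k: entropy(distributions[k]))


# Hypothetical next-word distributions from three decoders over a 3-word vocabulary.
decoder_outputs = [
    [0.7, 0.2, 0.1],
    [0.4, 0.4, 0.2],
    [0.1, 0.8, 0.1],
]
# Hypothetical voting weights, e.g. derived from validation scores.
voting_weights = [0.5, 0.3, 0.2]

fused = weighted_fusion(decoder_outputs, voting_weights)
best = lowest_entropy_index(decoder_outputs)
print("fused distribution:", fused)
print("lowest-entropy decoder index:", best)
```

In this sketch the two mechanisms are kept separate: `weighted_fusion` blends all decoders, while `lowest_entropy_index` selects a single one. How the paper combines them is not stated in the abstract.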
Pages: 6115-6132
Number of pages: 17