Multi-channel weighted fusion for image captioning

Authors
Jingyue Zhong
Yang Cao
Yina Zhu
Jie Gong
Qiaosen Chen
Affiliation
[1] South China Normal University, School of Computer Science
Source
The Visual Computer | 2023 / Volume 39
Keywords
Image captioning; Multi-channel encoder; Weighted fusion; Dimension reduction
Abstract
Automatically describing the details and content of an image is a meaningful but difficult task. In this paper, we propose a set of optimizations to the encoder and decoder for image captioning, collectively called multi-channel weighted fusion. In the presented model, we propose a multi-channel encoder that extracts different features of the same image by combining various models and algorithms. To avoid the dimensional explosion caused by the multi-channel encoder, we employ a reducing multilayer perceptron to lower the feature dimension and discuss how to train it. For the decoder, we discuss how it receives features from the different channels and propose a technique for fusing independent decoders of identical type. To obtain a better description from the decoder, we use a voting weight strategy for decoder fusion and explore an entropy function for choosing the best distribution. Experiments on the Flickr-8k, Flickr-30k, and MS COCO datasets demonstrate that the proposed model is compatible with most features and achieves a low error rate. In particular, the model performs especially well on the METEOR score.
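The abstract names two decoder-fusion ideas without giving formulas: weighting decoder outputs by votes, and using entropy to choose among distributions. The sketch below is only an illustration of those two ideas under stated assumptions, not the authors' method. It assumes each decoder emits a probability distribution over the next word, that fusion is a normalized weighted average, and that "best" means lowest Shannon entropy. The function names, the toy distributions, and the weights are all hypothetical.

```python
import math


def entropy(dist):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0.0)


def fuse_weighted(dists, weights):
    """Normalized weighted average of per-decoder next-word distributions.

    Assumption: the abstract does not define the voting weights, so any
    non-negative weights with a positive sum are accepted here.
    """
    total = sum(weights)
    vocab_size = len(dists[0])
    return [
        sum(w * d[i] for w, d in zip(weights, dists)) / total
        for i in range(vocab_size)
    ]


def pick_lowest_entropy(dists):
    """Index of the most confident (lowest-entropy) distribution.

    Assumption: "best distribution" is read as the one with minimum entropy.
    """
    return min(range(len(dists)), key=lambda k: entropy(dists[k]))


# Toy example: three decoders over a four-word vocabulary (made-up numbers).
decoder_dists = [
    [0.70, 0.10, 0.10, 0.10],
    [0.25, 0.25, 0.25, 0.25],
    [0.40, 0.40, 0.10, 0.10],
]
vote_weights = [2.0, 1.0, 1.0]  # hypothetical voting weights

fused = fuse_weighted(decoder_dists, vote_weights)
best = pick_lowest_entropy(decoder_dists)
```

In this toy case the first decoder is the most peaked, so it is the one selected by the entropy criterion, and the fused distribution remains a valid probability distribution because the weights are normalized.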
Pages: 6115-6132
Page count: 17