Multi-channel weighted fusion for image captioning

Authors
Jingyue Zhong
Yang Cao
Yina Zhu
Jie Gong
Qiaosen Chen
Affiliation
[1] South China Normal University, School of Computer Science
Source
The Visual Computer | 2023 / Vol. 39
Keywords
Image captioning; Multi-channel encoder; Weighted fusion; Dimension reduction
Abstract
Automatically describing the details and content of an image is a meaningful but difficult task. In this paper, we propose a set of optimizations to the encoder and decoder for image captioning, collectively called multi-channel weighted fusion. In the presented model, we propose a multi-channel encoder that extracts different features of the same image by combining various models and algorithms. To avoid the dimensional explosion caused by the multi-channel encoder, we employ a reducing multilayer perceptron to reduce the feature dimension and discuss how to train it. For the decoder, we discuss how it receives features from the different channels and propose a technique for fusing independent decoders of identical type. To obtain better descriptions from the decoder, we apply a voting weight strategy for decoder fusion and explore an entropy function to choose the best distribution. Experiments on the Flickr-8k, Flickr-30k and MS COCO datasets demonstrate that the proposed model is compatible with most features and has a low error rate. In particular, our model performs especially well on the METEOR score.
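The abstract does not say how the voting weights or the entropy function are defined, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes each decoder emits a probability distribution over the next word, that a lower-entropy (more confident) distribution deserves a larger vote, and that weights are proportional to exp(-entropy). The function names, the weighting rule and the toy numbers are all hypothetical.

```python
import math


def entropy(dist):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0.0)


def fuse_by_entropy(dists):
    """Fuse per-decoder next-word distributions with entropy-based weights.

    Assumption (not from the paper): a decoder's vote is proportional to
    exp(-entropy), so more confident decoders count for more.
    """
    raw = [math.exp(-entropy(d)) for d in dists]
    total = sum(raw)
    weights = [r / total for r in raw]  # normalised voting weights
    vocab_size = len(dists[0])
    fused = [
        sum(w * d[i] for w, d in zip(weights, dists))
        for i in range(vocab_size)
    ]
    return fused, weights


def pick_most_confident(dists):
    """Alternative reading: select the single lowest-entropy distribution."""
    return min(range(len(dists)), key=lambda k: entropy(dists[k]))


# Toy example: three hypothetical decoders over a four-word vocabulary.
decoder_outputs = [
    [0.70, 0.10, 0.10, 0.10],  # confident decoder
    [0.25, 0.25, 0.25, 0.25],  # uniform, least informative
    [0.40, 0.40, 0.10, 0.10],  # undecided between two words
]
fused, weights = fuse_by_entropy(decoder_outputs)
best = pick_most_confident(decoder_outputs)
```

Because the weights sum to one and every input is a distribution, the fused vector is itself a distribution. In this toy case the first decoder receives the largest weight and the uniform one the smallest.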
Pages: 6115 - 6132
Page count: 17