Reinforcement Learning Transformer for Image Captioning Generation Model

Cited by: 0
Authors
Yan, Zhaojie [1]
Affiliations
[1] Univ Calif Santa Barbara, Santa Barbara, CA 93106 USA
Keywords
Image captioning; transformer; reinforcement learning; reward dynamics backpropagation
DOI
10.1117/12.2680670
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Image caption generation combines computer vision and natural language processing, and the transformer framework has become the mainstream approach. This paper combines reinforcement learning with transformer methods, applying reward dynamics backpropagation and normalization in the testing phase. Its distinguishing characteristic is that, as the number of reinforcement learning steps increases, the agent model gains more knowledge of the full information, which reduces the computing cost of the system. The experimental results show that the reinforcement transformer structure achieves some improvement in speed.
Pages: 7