Image Captioning using Reinforcement Learning with BLUDEr Optimization

Cited by: 8

Authors:

Devi, P. R. [1]

Thrivikraman, V. [1]

Kashyap, D. [1]

Shylaja, S. S. [1]

Affiliations:

[1] PES Univ, Dept Comp Sci & Engn, Bangalore 560085, Karnataka, India

Source:

PATTERN RECOGNITION AND IMAGE ANALYSIS | 2020 / Vol. 30 / No. 4

Keywords:

image captioning; reinforcement learning; BLUDEr; self-critical sequence training; policy-gradient; deep learning; ResNet; long-short term memory; attention; BLEU; CIDEr;

DOI:

10.1134/S1054661820040094

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Image captioning is a growing field of research that has attracted wide attention in the research community. It is a challenging task owing to the complexity of natural language generation and the difficulty of extracting features from a diverse collection of images. Many models have been proposed to tackle the problem, such as state-of-the-art encoder-decoder (sequential CNN-RNN) systems, which have proved effective. Recently, reinforcement learning has emerged as a new approach to the problem and has surpassed many of the state-of-the-art paradigms. We propose a new reward system, the BLUDEr metric, which is a linear combination of the non-differentiable metrics BLEU and CIDEr. We directly optimize this metric for our model on natural language generation tasks. In our experiments, we use the Flickr30k and Flickr8k datasets, two of the benchmark datasets for image captioning systems. We achieve state-of-the-art results on these two datasets when compared with other models.
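
The reward described in the abstract can be illustrated with a minimal sketch. The abstract says only that BLUDEr is a linear combination of BLEU and CIDEr, so the convex form and the weight `alpha` below are assumptions, as are the function names. The loss follows the general self-critical sequence training idea listed in the keywords (sampled-caption reward minus greedy-caption reward as the advantage) and is not taken from the paper itself. BLEU and CIDEr scores are passed in as plain numbers rather than computed here.

```python
from typing import Sequence


def bluder_reward(bleu: float, cider: float, alpha: float = 0.5) -> float:
    """Hypothetical BLUDEr-style reward: a linear mix of BLEU and CIDEr.

    `alpha` is an assumed weight; the abstract does not state the
    coefficients actually used.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * bleu + (1.0 - alpha) * cider


def scst_loss(sample_logprobs: Sequence[float],
              sample_reward: float,
              greedy_reward: float) -> float:
    """Self-critical policy-gradient loss for one sampled caption.

    The greedy caption's reward acts as the baseline, so a sampled
    caption is reinforced only when it scores higher than greedy decoding.
    """
    advantage = sample_reward - greedy_reward
    # Minimizing -advantage * log p(caption) raises the probability
    # of captions with a positive advantage.
    return -advantage * sum(sample_logprobs)


# Illustrative numbers only, not results from the paper.
r_sample = bluder_reward(bleu=0.5, cider=1.2)
r_greedy = bluder_reward(bleu=0.4, cider=1.0)
loss = scst_loss([-0.5, -1.0], r_sample, r_greedy)
```

Because the reward enters the loss only as a scalar multiplier on the log-probabilities, it does not need to be differentiable, which is why metrics such as BLEU and CIDEr can be optimized this way.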


Pages: 607-613

Page count: 7

50 items in total

[1] Image Captioning using Reinforcement Learning with BLUDEr Optimization
P. R. Devi
V. Thrivikraman
D. Kashyap
S. S. Shylaja
[J]. Pattern Recognition and Image Analysis, 2020, 30 : 607 - 613
[2] Image Captioning using Adversarial Networks and Reinforcement Learning
Yan, Shiyang
Wu, Fangyu
Smith, Jeremy S.
Lu, Wenjin
Zhang, Bailing
[J]. 2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 248 - 253
[3] Reinforcement Learning Transformer for Image Captioning Generation Model
Yan, Zhaojie
[J]. FIFTEENTH INTERNATIONAL CONFERENCE ON MACHINE VISION, ICMV 2022, 2023, 12701
[4] Switching to Discriminative Image Captioning by Relieving a Bottleneck of Reinforcement Learning
Honda, Ukyo
Watanabe, Taro
Matsumoto, Yuji
[J]. 2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 1124 - 1134
[5] Image Captioning: From Encoder-Decoder to Reinforcement Learning
Tang, Yu
[J]. 2022 6TH INTERNATIONAL CONFERENCE ON IMAGING, SIGNAL PROCESSING AND COMMUNICATIONS, ICISPC, 2022, : 6 - 10
[6] Image Captioning using Deep Learning
Jain, Yukti Sanjay
Dhopeshwar, Tanisha
Chadha, Supreet Kaur
Pagire, Vrushali
[J]. 2021 INTERNATIONAL CONFERENCE ON COMPUTATIONAL PERFORMANCE EVALUATION (COMPE-2021), 2021,
[7] Image Captioning Using Deep Learning
Adithya, Paluvayi Veera
Kalidindi, Mourya Viswanadh
Swaroop, Nallani Jyothi
Vishwas, H. N.
[J]. ADVANCED NETWORK TECHNOLOGIES AND INTELLIGENT COMPUTING, ANTIC 2023, PT III, 2024, 2092 : 42 - 58
[8] Remote sensing image captioning via Variational Autoencoder and Reinforcement Learning
Shen, Xiangqing
Liu, Bing
Zhou, Yong
Zhao, Jiaqi
Liu, Mingming
[J]. KNOWLEDGE-BASED SYSTEMS, 2020, 203
[9] Multi-Level Policy and Reward Reinforcement Learning for Image Captioning
Liu, An-An
Xu, Ning
Zhang, Hanwang
Nie, Weizhi
Su, Yuting
Zhang, Yongdong
[J]. PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018, : 821 - 827
[10] Improving Reinforcement Learning Based Image Captioning with Natural Language Prior
Guo, Tszhang
Chang, Shiyu
Yu, Mo
Bai, Kun
[J]. 2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 751 - 756
