Image Caption Generation Using Attention Model

Cited by: 0

Authors:

Ramalakshmi, Eliganti [1]

Jain, Moksh Sailesh [1]

Uddin, Mohammed Ameer [1]

Affiliations:

[1] Chaitanya Bharathi Inst Technol, Hyderabad, India

Source:

INNOVATIVE DATA COMMUNICATION TECHNOLOGIES AND APPLICATION, ICIDCA 2021 | 2022 / Vol. 96

Keywords:

Attention model; Encoder-decoder architecture; Transfer learning; Interpretability enhancement model;

DOI:

10.1007/978-981-16-7167-8_74

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Image caption generation is the task of producing a caption for a given image using techniques from computer vision and natural language processing. In recent years, many deep learning models have been used to improve the performance of caption generation. A drawback of these models is that they do not focus properly on the pertinent parts of the image while generating the caption, which leads to vague captions. To overcome this drawback, we propose a model that generates a caption by selecting the pertinent objects in an image and provides a perceivable explanation based on them.

Pages: 1009-1017

Page count: 9

50 items in total

[1] Image caption generation using a dual attention mechanism
Padate, Roshni
Jain, Amit
Kalla, Mukesh
Sharma, Arvind
[J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 123
[2] Recurrent Attention LSTM Model for Image Chinese Caption Generation
Zhang, Chaoying
Dai, Yaping
Cheng, Yanyan
Jia, Zhiyang
Hirota, Kaoru
[J]. 2018 JOINT 10TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS (SCIS) AND 19TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS (ISIS), 2018, : 808 - 813
[3] Assamese news image caption generation using attention mechanism
Das, Ringki
Singh, Thoudam Doren
[J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (07) : 10051 - 10069
[4] Assamese news image caption generation using attention mechanism
Ringki Das
Thoudam Doren Singh
[J]. Multimedia Tools and Applications, 2022, 81 : 10051 - 10069
[5] Image caption generation with dual attention mechanism
Liu, Maofu
Li, Lingjun
Hu, Huijun
Guan, Weili
Tian, Jing
[J]. INFORMATION PROCESSING & MANAGEMENT, 2020, 57 (02)
[6] Clothes image caption generation with attribute detection and visual attention model
Li, Xianrui
Ye, Zhiling
Zhang, Zhao
Zhao, Mingbo
[J]. PATTERN RECOGNITION LETTERS, 2021, 141 (141) : 68 - 74
[7] Automatic image caption generation using deep learning and multimodal attention
Dai, Jin
Zhang, Xinyu
[J]. COMPUTER ANIMATION AND VIRTUAL WORLDS, 2022, 33 (3-4)
[8] Bahdanau Attention Based Bengali Image Caption Generation
Alam, Md Sahrial
Rahman, Md Sayedur
Hosen, Md Ikbal
Mubin, Khairul Anam
Hossen, Sharif
Mridha, M. F.
[J]. 2022 INTERNATIONAL CONFERENCE ON DECISION AID SCIENCES AND APPLICATIONS (DASA), 2022, : 1073 - 1077
[9] Fine-grained attention for image caption generation
Chang, Yan-Shuo
[J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (03) : 2959 - 2971
[10] Cross-Lingual Image Caption Generation Based on Visual Attention Model
Wang, Bin
Wang, Cungang
Zhang, Qian
Su, Ying
Wang, Yang
Xu, Yanyan
[J]. IEEE ACCESS, 2020, 8 : 104543 - 104554
