Automated Image Captioning Using Sparrow Search Algorithm With Improved Deep Learning Model

Cited by: 0

Authors:

Arasi, Munya A. [1]

Alshahrani, Haya Mesfer [2]

Alruwais, Nuha [3]

Motwakel, Abdelwahed [4]

Ahmed, Noura Abdelaziz [5]

Mohamed, Abdullah [6]

Affiliations:

[1] King Khalid Univ, Coll Sci & Arts Rijal Almaa, Dept Comp Sci, Abha 62529, Saudi Arabia

[2] Princess Nourah Bint Abdulrahman Univ, Coll Comp & Informat Sci, Dept Informat Syst, POB 84428, Riyadh 11671, Saudi Arabia

[3] King Saud Univ, Coll Appl Studies & Community Serv, Dept Comp Sci & Engn, POB 22459, Riyadh 11495, Saudi Arabia

[4] Prince Sattam Bin Abdulaziz Univ, Coll Business Adm Hawtat Bani Tamim, Dept Management Informat Syst, Al Kharj 11942, Saudi Arabia

[5] Prince Sattam Bin Abdulaziz Univ, Dept Comp & Self Dev, Preparatory Year Deanship, Al Kharj 11942, Saudi Arabia

[6] Future Univ Egypt, Res Ctr, New Cairo 11845, Egypt

Source:

IEEE ACCESS | 2023 / Vol. 11

Keywords:

Convolutional neural networks; Visualization; Feature extraction; Convolution; Deep learning; Natural language processing; Computational modeling; Image capture; Search methods; Image captioning; deep learning; natural language processing; sparrow search algorithm; computer vision;

DOI:

10.1109/ACCESS.2023.3317276

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Image captioning is a deep learning task that generates textual descriptions (captions) for images. It integrates computer vision and natural language processing (NLP) to comprehend the visual content of an image and generate human-like descriptions. Deep learning (DL) based image captioning models can be trained on large-scale datasets, allowing them to generalize across various types of images and generate captions that apply to a wide range of visual scenarios. By combining computer vision and NLP, DL-enabled image captioning models can understand both visual and textual information, which enables them to generate captions that not only describe the visual content but also incorporate contextual and semantic information. This study develops an Automated Image Captioning using Sparrow Search Algorithm with Improved Deep Learning (AIC-SSAIDL) technique, whose main aim is the automated generation of textual captions for input images. To accomplish this, the AIC-SSAIDL technique uses the MobileNetv2 model to generate feature descriptors of the input images, with its hyperparameters tuned by the sparrow search algorithm (SSA). For the captioning process, the technique uses an attention mechanism with a long short-term memory (AM-LSTM) network. Finally, the hyperparameters of the AM-LSTM model are selected by the fruit fly optimization (FFO) algorithm. A wide range of experiments was conducted on benchmark data to demonstrate the performance of the AIC-SSAIDL method. The comprehensive result analysis highlighted the enhanced captioning results of the AIC-SSAIDL method, with maximum CIDEr scores of 46.12, 61.89, and 137.45 on the Flickr8k, Flickr30k, and MSCOCO datasets, respectively.
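
To make the attention step mentioned in the abstract more concrete, the following is a minimal sketch of soft attention over image region features. The abstract does not state which attention form AM-LSTM uses, so the dot-product scoring here is an assumption, and the names `softmax`, `attend`, `regions`, and `hidden` are hypothetical, chosen only for illustration. It is not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_features, hidden_state):
    """Dot-product soft attention (assumed form, not taken from the paper).

    region_features: list of feature vectors, one per image region
    hidden_state: current decoder (LSTM) state, same dimension as each feature
    Returns (weights, context), where context is the weighted sum of features.
    """
    # Score each region by its similarity to the decoder state.
    scores = [
        sum(f * h for f, h in zip(feat, hidden_state))
        for feat in region_features
    ]
    # Normalize scores into attention weights that sum to 1.
    weights = softmax(scores)
    # Context vector: attention-weighted average of the region features.
    dim = len(hidden_state)
    context = [
        sum(w * feat[i] for w, feat in zip(weights, region_features))
        for i in range(dim)
    ]
    return weights, context


# Toy input: three 2-dimensional region features and one decoder state.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden = [2.0, 0.0]
weights, context = attend(regions, hidden)
```

In a captioning decoder, a context vector of this kind would typically be fed to the LSTM alongside the previous word embedding at each time step, so that each generated word can focus on different image regions.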

Pages: 104633-104642

Number of pages: 10
