Automatic image captioning system using a deep learning approach

Cited by: 1
Authors
Deepak, Gerard [1]
Gali, Sowmya [2]
Sonker, Abhilash [3]
Jos, Bobin Cherian [4]
Sagar, K. V. Daya [5]
Singh, Charanjeet [6]
Affiliations
[1] Manipal Acad Higher Educ, Manipal Inst Technol Bengaluru, Dept Comp Sci & Engn, Manipal, India
[2] Santhiram Engn Coll, Dept Elect & Commun Engn, Nandyal, Andhra Pradesh, India
[3] MITS Gwalior, Dept Informat Technol, Gwalior, Madhya Pradesh, India
[4] Mar Athanasius Coll Engn, Dept Mech Engn, Kothamangalam, India
[5] Koneru Lakshmaiah Educ Fdn, Elect & Comp Engn, Guntur, Andhra Pradesh, India
[6] Deenbandhu Chhotu Ram Univ Sci & Technol, Dept Elect & Commun, Murthal, Haryana, India
Keywords
Image captioning; Deep learning; Generative adversarial network; Residual learning; ATTENTION; NETWORK; MEDIA; TEXT;
DOI
10.1007/s00500-023-08544-8
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper tailors a residual network to improve the generation of high-quality image captions, drawing on the relevant image content and a high-quality interpretation of it. The research develops a Residual Attention Generative Adversarial Network (RAGAN), which applies attention-based residual learning within a Generative Adversarial Network (GAN) to improve the diversity and fidelity of the generated captions. RAGAN selects words from the feature maps more quickly to generate high-quality captions, and it increases both caption diversity and language-metric scores. The generator is designed as an encoder-decoder mechanism that operates in an unsupervised manner, with residual learning adopted between the encoder and decoder networks. The discriminator is connected to a language evaluator unit, which provides feed-forward signals to the generator and discriminator to influence the image captioning process either positively or negatively. The experiments show that the proposed RAGAN performs better than state-of-the-art GAN models.
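The abstract gives no equations for the attention-based residual connection between encoder and decoder, so the following is only a minimal sketch of the general idea, not the authors' RAGAN implementation. It assumes scaled dot-product attention over encoder feature vectors and a simple additive skip connection; the function names `softmax` and `residual_attention` and the toy inputs are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def residual_attention(query, regions):
    """Attend over encoder feature vectors and add the result to the query.

    query   -- decoder state vector (assumed to share the feature dimension)
    regions -- list of encoder feature vectors, one per image region
    Returns (output, weights), where output = query + attended context.
    """
    scale = math.sqrt(len(query))
    # Scaled dot-product score between the decoder state and each region.
    scores = [
        sum(q * r for q, r in zip(query, region)) / scale
        for region in regions
    ]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the region features.
    context = [
        sum(w * region[i] for w, region in zip(weights, regions))
        for i in range(len(query))
    ]
    # Residual (skip) connection: the decoder state bypasses the attention path.
    output = [q + c for q, c in zip(query, context)]
    return output, weights


# Toy example: a 2-dimensional decoder state and three image regions.
query = [1.0, 0.0]
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
output, weights = residual_attention(query, regions)
```

In this sketch the skip connection means the decoder state is preserved even when the attention weights are nearly uniform, which is the usual motivation for residual learning; whether RAGAN combines the two paths by addition is an assumption here.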
Pages: 9
Related papers
50 in total
  • [1] Automatic image captioning in Thai for house defect using a deep learning-based approach
    Manadda Jaruschaimongkol
    Krittin Satirapiwong
    Kittipan Pipatsattayanuwong
    Suwant Temviriyakul
    Ratchanat Sangprasert
    Thitirat Siriborvornratanakul
    [J]. Advances in Computational Intelligence, 2024, 4 (1):
  • [2] Image Captioning Using Deep Learning
    Adithya, Paluvayi Veera
    Kalidindi, Mourya Viswanadh
    Swaroop, Nallani Jyothi
    Vishwas, H. N.
    [J]. ADVANCED NETWORK TECHNOLOGIES AND INTELLIGENT COMPUTING, ANTIC 2023, PT III, 2024, 2092 : 42 - 58
  • [3] Image Captioning using Deep Learning
    Jain, Yukti Sanjay
    Dhopeshwar, Tanisha
    Chadha, Supreet Kaur
    Pagire, Vrushali
    [J]. 2021 INTERNATIONAL CONFERENCE ON COMPUTATIONAL PERFORMANCE EVALUATION (COMPE-2021), 2021,
  • [4] Automatic Bangla Image Captioning Based on Transformer Model in Deep Learning
    Hossain, Md Anwar
    Hasan, Mirza A. F. M. Rashidul
    Hossen, Ebrahim
    Asraful, Md
    Faruk, Md Omar
    Abadin, A. F. M. Zainul
    Ali, Md Suhag
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2023, 14 (11) : 1110 - 1117
  • [5] Image and Video Captioning for Apparels Using Deep Learning
    Agarwal, Govind
    Jindal, Kritika
    Chowdhury, Abishi
    Singh, Vishal K.
    Pal, Amrit
    [J]. IEEE ACCESS, 2024, 12 : 113138 - 113150
  • [6] Generative image captioning in Urdu using deep learning
    Afzal M.K.
    Shardlow M.
    Tuarob S.
    Zaman F.
    Sarwar R.
    Ali M.
    Aljohani N.R.
    Lytras M.D.
    Nawaz R.
    Hassan S.-U.
    [J]. Journal of Ambient Intelligence and Humanized Computing, 2023, 14 (6) : 7719 - 7731
  • [7] Deep Learning for Military Image Captioning
    Das, Subrata
    Jain, Lalit
    Das, Amp
    [J]. 2018 21ST INTERNATIONAL CONFERENCE ON INFORMATION FUSION (FUSION), 2018, : 2165 - 2171
  • [8] Vocabulary Learning Support System Based on Automatic Image Captioning Technology
    Hasnine, Mohammad Nehal
    Flanagan, Brendan
    Akcapinar, Gokhan
    Ogata, Hiroaki
    Mouri, Kousuke
    Uosaki, Noriko
    [J]. DISTRIBUTED, AMBIENT AND PERVASIVE INTERACTIONS, 2019, 11587 : 346 - 358
  • [9] Image Captioning using Deep Learning: A Systematic Literature Review
    Chohan, Murk
    Khan, Adil
    Mahar, Muhammad Saleem
    Hassan, Saif
    Ghafoor, Abdul
    Khan, Mehmood
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2020, 11 (05) : 278 - 286
  • [10] Metaheuristics Optimization with Deep Learning Enabled Automated Image Captioning System
    Al Duhayyim, Mesfer
    Alazwari, Sana
    Mengash, Hanan Abdullah
    Marzouk, Radwa
    Alzahrani, Jaber S.
    Mahgoub, Hany
    Althukair, Fahd
    Salama, Ahmed S.
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (15):