A Hindi Image Caption Generation Framework Using Deep Learning

Cited by: 14
Authors
Mishra, Santosh Kumar [1]
Dhir, Rijul [1]
Saha, Sriparna [1]
Bhattacharyya, Pushpak [1]
Affiliations
[1] Indian Inst Technol, Patna, Bihar, India
Keywords
Image captioning; Hindi; deep learning; attention
DOI
10.1145/3432246
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Image captioning is the task of generating a textual description of the salient parts of a given image. It is an important problem because it combines computer vision, which is used to understand the image, with natural language processing, which is used for language modeling. Much work has been done on image captioning for English. In this article, we develop a model for image captioning in Hindi. Hindi is an official language of India and the fourth most spoken language in the world, spoken in India and elsewhere in South Asia. To the best of our knowledge, this is the first attempt to generate image captions in Hindi. A dataset is created manually by translating the well-known MSCOCO dataset from English to Hindi. Different types of attention-based architectures are then developed for Hindi image captioning; these attention mechanisms have not previously been applied to Hindi. The proposed model is compared with several baselines in terms of BLEU scores, and the results show that it outperforms them. Manual evaluation of the generated captions in terms of adequacy and fluency also confirms the effectiveness of the proposed approach.
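The abstract reports results as BLEU scores. As a rough illustration of what that metric measures, the following is a minimal sketch of sentence-level BLEU (clipped n-gram precision combined with a brevity penalty). It is not the authors' evaluation code; the function names `ngrams` and `bleu` and the Hindi example tokens are assumptions made up for this sketch, and no smoothing is applied.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights.

    candidate: list of tokens; references: list of token lists.
    Returns 0.0 if any n-gram order has no match.
    """
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        # For each n-gram, the largest count seen in any single reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        # Clip candidate counts so repeated words cannot inflate precision.
        clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if clipped == 0 or total == 0:
            return 0.0
        log_precision_sum += math.log(clipped / total) / max_n
    # Brevity penalty against the reference length closest to the candidate.
    c = len(candidate)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_precision_sum)


# Illustrative tokens only (roughly "a man is sitting on a horse");
# not taken from the paper's dataset.
reference = ["एक", "आदमी", "घोड़े", "पर", "बैठा", "है"]
candidate = ["एक", "आदमी", "घोड़े", "पर", "खड़ा", "है"]
score_bleu1 = bleu(candidate, [reference], max_n=1)
score_bleu4 = bleu(candidate, [reference], max_n=4)
```

Here `max_n=1` gives BLEU-1 (unigram overlap only), while `max_n=4` gives the stricter BLEU-4, which also rewards matching word order. Published evaluations usually compute BLEU at the corpus level and may apply smoothing, so scores from this sketch are not directly comparable to reported numbers.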
Pages: 19