Captioning the Images: A Deep Analysis

Cited by: 0
Authors
Chaudhari, Chaitrali P. [1]
Devane, Satish [2]
Affiliations
[1] Lokmanya Tilak Coll Engn, Navi Mumbai, India
[2] Datta Meghe Coll Engn, Navi Mumbai, India
Keywords
Image captioning; Natural language processing; Computer vision; Representation
DOI
10.1007/978-981-13-1513-8_100
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Image captioning is one of the fundamental tasks in machine learning, since the ability to generate text captions for an image can have a great impact by assisting us in day-to-day life. However, it is not just an object classification or recognition task, because the model must capture the dependencies among the recognized objects and their attributes and express that knowledge correctly in a caption written in a natural language such as English. The internet is now overwhelmed with a huge amount of textual and visual data, including billions of unstructured images and videos. Meaningful captions can serve as useful keys for retrieval, creative searching, and powerful browsing of these images. In this paper, we analyze and classify the recent state of the art in image captioning and discuss the significant differences among the approaches. We provide a comparative review of existing models and techniques, with their advantages and disadvantages. Future directions in the field of automatic image caption generation are also explored.
Pages: 987 - 999
Page count: 13
Related papers
50 in total
  • [1] Scene captioning with deep fusion of images and point clouds
    Yu, Qiang
    Zhang, Chunxia
    Weng, Lubin
    Xiang, Shiming
    Pan, Chunhong
    [J]. PATTERN RECOGNITION LETTERS, 2022, 158 : 9 - 15
  • [2] Arabic Captioning for Images of Clothing Using Deep Learning
    Al-Malki, Rasha Saleh
    Al-Aama, Arwa Yousuf
    [J]. SENSORS, 2023, 23 (08)
  • [3] Vision to Language: Captioning Images using Deep Learning
    Charu, Shreyasi
    Mishra, S. P.
    Gandhi, Tapan
    [J]. 2020 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND SIGNAL PROCESSING (AISP), 2020
  • [4] Advanced Generative Deep Learning Techniques for Accurate Captioning of Images
    Chandar, J. Navin
    Kavitha, G.
    [J]. WIRELESS PERSONAL COMMUNICATIONS, 2024
  • [5] Deep Learning for automatically describing images in natural language - Image Captioning
    Hotaran, Anca Mihaela
    Vrejoiu, Mihnea Horia
    [J]. ROMANIAN JOURNAL OF INFORMATION TECHNOLOGY AND AUTOMATIC CONTROL-REVISTA ROMANA DE INFORMATICA SI AUTOMATICA, 2020, 30 (01) : 87 - 100
  • [6] Captioning Ultrasound Images Automatically
    Alsharid, Mohammad
    Sharma, Harshita
    Drukker, Lior
    Chatelain, Pierre
    Papageorghiou, Aris T.
    Noble, J. Alison
    [J]. MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2019, PT IV, 2019, 11767 : 338 - 346
  • [7] Captioning Images with Diverse Objects
    Venugopalan, Subhashini
    Mooney, Raymond
    Hendricks, Lisa Anne
    Darrell, Trevor
    Rohrbach, Marcus
    Saenko, Kate
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017 : 1170 - 1178
  • [8] Image and audio caps: automated captioning of background sounds and images using deep learning
    Poongodi, M.
    Hamdi, Mounir
    Wang, Huihui
    [J]. MULTIMEDIA SYSTEMS, 2023, 29 (05) : 2951 - 2959
  • [9] Deep Image Captioning: An Overview
    Hrga, I.
    Ivasic-Kos, M.
    [J]. 2019 42ND INTERNATIONAL CONVENTION ON INFORMATION AND COMMUNICATION TECHNOLOGY, ELECTRONICS AND MICROELECTRONICS (MIPRO), 2019 : 995 - 1000