Automatic image captioning combining natural language processing and deep neural networks

Cited by: 8
Authors
Rinaldi, Antonio M. [1]
Russo, Cristiano [1]
Tommasino, Cristian [1]
Affiliations
[1] Univ Naples Federico II, Dept Elect Engn & Informat Technol, IKNOS LAB Intelligent & Knowledge Syst LUPT, Via Claudio 21, I-80125 Naples, Italy
Keywords
Object detection; Image captioning; Deep neural networks; Semantic-instance segmentation
DOI
10.1016/j.rineng.2023.101107
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
An image contains a large amount of information that humans can perceive in a very short time. Image captioning aims to capture this information by describing the image content through image and text processing techniques. A distinguishing feature of the proposed approach is the combination of multiple networks to capture as many semantically distinct features as possible. In this work, our goal is to show that a strategy combining existing methods can efficiently improve performance in object detection tasks compared with the performance achieved by each method tested individually. The approach uses different deep neural networks that perform two levels of hierarchical object detection in an image. The results are combined and used by a captioning module that generates image captions through natural language processing techniques. Several experimental results are reported and discussed to show the effectiveness of our framework. The combination strategy also shows a gain in precision over single models.
Pages: 14