Infrared Image Caption Based on Object-Oriented Attention

Cited by: 2
Authors
Lv, Junfeng [1]
Hui, Tian [1]
Zhi, Yongfeng [1]
Xu, Yuelei [1]
Affiliations
[1] Northwestern Polytech Univ, Inst Unmanned Syst Res, Xian 710072, Peoples R China
Keywords
infrared image caption; domain transfer object detection; adaptive weighting module; object-oriented attention
DOI
10.3390/e25050826
Chinese Library Classification number
O4 [Physics]
Discipline classification code
0702
Abstract
With the ongoing development of imaging technology, deploying intelligent applications on embedded devices has attracted increasing attention in industry. One such application is automatic captioning of infrared images, which converts an image into a text description. This task is widely used in night-time security and in night-scene understanding, among other scenarios. However, because of the differences in image features and the complexity of semantic information, generating captions for infrared images remains challenging. From the perspective of deployment and application, and to improve the correlation between descriptions and objects, we adopted YOLOv6 and LSTM as the encoder-decoder structure and proposed an infrared image captioning method based on object-oriented attention. First, to improve the domain adaptability of the detector, we optimized the pseudo-label learning process. Second, we proposed an object-oriented attention method to address the alignment problem between complex semantic information and embedded words. This method helps select the most crucial features of the object region and guides the caption model to generate words that are more relevant to the object. Our methods perform well on infrared images and can produce words explicitly associated with the object regions located by the detector. The robustness and effectiveness of the proposed methods were demonstrated through evaluation on several datasets and comparison with other state-of-the-art methods. Our approach achieved BLEU-4 scores of 31.6 and 41.2 on the KAIST and Infrared City and Town datasets, respectively. It provides a feasible solution for deployment on embedded devices in industrial applications.
Pages: 18
Related papers
50 in total
  • [1] An image fusion method based on object-oriented image classification
    Chen, YH
    Fung, T
    Lin, WJ
    Wang, JF
    IGARSS 2005: IEEE International Geoscience and Remote Sensing Symposium, Vols 1-8, Proceedings, 2005, : 3924 - 3927
  • [2] An image fusion method based on object-oriented classification
    Jing, Linhai
    Cheng, Qiuming
    INTERNATIONAL JOURNAL OF REMOTE SENSING, 2012, 33 (08) : 2434 - 2450
  • [3] PolSAR image classification based on object-oriented technology
    Xiao Yan
    Wang Bin
    JOURNAL OF INFRARED AND MILLIMETER WAVES, 2020, 39 (04) : 505 - 512
  • [4] Object-oriented building extraction based on visual attention mechanism
    Shen, Xiaole
    Yu, Chen
    Lin, Lin
    Cao, Jinzhou
    PEERJ COMPUTER SCIENCE, 2023, 9
  • [5] Progressing from object-based to object-oriented image analysis
    Baatz, M.
    Hoffmann, C.
    Willhauck, G.
    Lecture Notes in Geoinformation and Cartography, 2008, 0 (9783540770572) : 29 - 42
  • [6] Object-Oriented Image Database model
    Grosky, WI
    Stanchev, PL
    COMPUTERS AND THEIR APPLICATIONS, 2001, : 94 - 97
  • [7] Object-oriented knowledge-based system for image diagnosis
    Chan, Samuel W.K.
    Leung, K.S.
    Wong, W.S.Felix
    1996, Taylor & Francis Ltd, London, United Kingdom (10)
  • [8] Object-oriented knowledge-based system for image diagnosis
    Chan, SWK
    Leung, KS
    Wong, WSF
    APPLIED ARTIFICIAL INTELLIGENCE, 1996, 10 (05) : 407 - 438
  • [9] Meaningful image objects for object-oriented image analysis
    Lizarazo, I.
    REMOTE SENSING LETTERS, 2013, 4 (05) : 419 - 426
  • [10] An image caption method based on object detection
    Cao, Danyang
    Zhu, Menggui
    Gao, Lei
    MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (24) : 35329 - 35350