Incorporating object counts into remote sensing image captioning

Authors
Ni, Zihao [1]
Zong, Zhaoyun [2]
Ren, Peng [1]
Affiliations
[1] China Univ Petr East China, Coll Oceanog & Space Informat, Qingdao 266580, Peoples R China
[2] China Univ Petr East China, Natl Key Lab Deep Oil & Gas, Qingdao, Peoples R China
Keywords
Remote sensing; earth observation; artificial intelligence; image processing; NETWORK;
DOI
10.1080/17538947.2024.2392847
Chinese Library Classification
P9 [Physical Geography];
Discipline classification code
0705 ; 070501 ;
Abstract
Existing methods for remote sensing image captioning tend to describe a remote sensing image using generic language that lacks specific information about object counts. To address this limitation, we propose a novel framework for generating a caption that includes object count information for the remote sensing image. Our proposed framework comprises three modules: object counting, preliminary captioning, and numeral editing. The object counting module identifies objects in a remote sensing image and determines object counts. The preliminary captioning module generates a caption that may lack object count information. The numeral editing module incorporates the object counts into the caption, resulting in a more precise caption. Our proposed framework outperforms existing methods, as demonstrated through evaluations on three remote sensing image datasets. Our proposed framework is a significant step toward more precise and informative remote sensing image captioning.
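The abstract does not say how the numeral editing module works internally, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes the object counts arrive as a dictionary of singular noun to count, and that the preliminary caption uses a vague quantifier before the plural noun. The names `edit_numerals`, `VAGUE_QUANTIFIERS`, and `NUMBER_WORDS` are hypothetical.

```python
import re
from typing import Dict

# Hypothetical list of vague quantifiers a preliminary caption might contain.
VAGUE_QUANTIFIERS = ("some", "many", "several", "a few", "a lot of", "lots of")

# Spell out small counts; larger counts fall back to digits.
NUMBER_WORDS = {
    2: "two", 3: "three", 4: "four", 5: "five", 6: "six",
    7: "seven", 8: "eight", 9: "nine", 10: "ten",
}


def edit_numerals(caption: str, counts: Dict[str, int]) -> str:
    """Replace a vague quantifier before a counted plural noun with its count.

    Assumption: plurals are formed by appending "s" to the singular noun.
    Nouns with a count below 2 are left untouched.
    """
    quantifiers = "|".join(re.escape(q) for q in VAGUE_QUANTIFIERS)
    edited = caption
    for noun, n in counts.items():
        if n < 2:
            continue
        pattern = re.compile(
            rf"\b(?:{quantifiers})\s+({re.escape(noun)}s)\b", re.IGNORECASE
        )
        number = NUMBER_WORDS.get(n, str(n))
        # Keep the noun exactly as it appeared in the caption.
        edited = pattern.sub(lambda m: f"{number} {m.group(1)}", edited)
    return edited


if __name__ == "__main__":
    preliminary = "many planes are parked near a terminal"
    print(edit_numerals(preliminary, {"plane": 4}))
```

In this sketch the counting and captioning stages are treated as black boxes that supply `counts` and `caption`; a rule-based substitution stands in for whatever editing mechanism the paper actually uses.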
Pages: 16