Context-Aggregated and SAM-Guided Network for ViT-Based Instance Segmentation in Remote Sensing Images

Cited by: 0
Authors
Liu, Shuangzhou [1, 2, 3]
Wang, Feng [1, 2]
You, Hongjian [1, 2, 3]
Jiao, Niangang [1, 2]
Zhou, Guangyao [1, 2]
Zhang, Tingtao [1, 2]
Affiliations
[1] Chinese Acad Sci, Aerosp Informat Res Inst, Key Lab Technol Geospatial Informat Proc & Applica, Beijing 100190, Peoples R China
[2] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100094, Peoples R China
[3] Univ Chinese Acad Sci, Sch Elect Elect & Commun Engn, Beijing 101408, Peoples R China
Keywords
instance segmentation; remote sensing images; SAM; backbone
DOI
10.3390/rs16132472
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Instance segmentation of remote sensing images provides both object-level and pixel-level positioning information. Pixel-level annotation of this kind has a wide range of uses in remote sensing and is of great value for environmental detection and resource management. Because optical images generally contain complex terrain and variable object shapes, and SAR images are affected by complex scattering phenomena, the masks produced by traditional instance segmentation methods on remote sensing images are of low quality. Improving the mask quality of instance segmentation in remote sensing images is therefore a challenging task. Since a traditional two-stage instance segmentation method consists of a backbone, neck, bbox head, and mask head, the final mask quality depends on the product of the quality of all preceding stages. We therefore target the difficulties that optical and SAR images pose for instance segmentation with improvements to the neck, bbox head, and mask head, and we propose the Context-Aggregated and SAM-Guided Network (CSNet). In this network, the plain feature fusion pyramid network (PFFPN) generates a pyramid from the plain feature and provides feature maps at scales appropriate to the instances for detection and segmentation. The context aggregation bbox head (CABH) uses the context information around an instance together with the instance information to reduce missed and false detections. The SAM-Guided mask head (SGMH) learns with SAM as a teacher and uses the learned knowledge to refine mask edges. Experimental results show that CSNet significantly improves the quality of masks generated on optical and SAR images, achieving AP gains of 5.1% and 3.2% over other SOTA models.
Pages: 27