TRCDNet: A Transformer Network for Video Cloud Detection

Cited: 2
Authors
Luo, Chen [1 ,2 ]
Feng, Shanshan [1 ,2 ]
Quan, Yingling [3 ]
Ye, Yunming [1 ,2 ]
Li, Xutao [1 ,2 ]
Xu, Yong [3 ]
Zhang, Baoquan [3 ]
Chen, Zhihao [3 ]
Affiliations
[1] Harbin Inst Technol, Dept Comp Sci, Shenzhen 518055, Peoples R China
[2] Harbin Inst Technol, Shenzhen Key Lab Internet Informat Collaborat, Shenzhen, Peoples R China
[3] Harbin Inst Technol, Sch Comp Sci & Technol, Shenzhen 518071, Peoples R China
Keywords
Cloud detection on geostationary satellite images; Fengyun-4A satellites; video cloud detection; SHADOW DETECTION; ALGORITHMS; FEATURES; IMAGERY; FUSION;
DOI
10.1109/TGRS.2023.3288543
CLC Classification
P3 [Geophysics]; P59 [Geochemistry];
Discipline Classification Code
0708 ; 070902 ;
Abstract
In remote-sensing image (RSI) preprocessing, detecting and removing cloudy areas is a critical task. Recently, cloud detection methods based on deep neural networks have achieved outstanding performance over traditional methods. Current approaches mostly focus on cloud detection in a single image captured by a polar-orbiting satellite. However, another type of meteorological satellite, the geostationary satellite, can capture temporally consecutive frames of a particular location. The cloud detection task for a geostationary satellite can therefore be treated as a video cloud detection task, in which, beyond extracting features from a single image, extracting and fully exploiting the relations between sequential frames is also important. To tackle this problem, we design a deep-learning video cloud detection model, the transformer network for video cloud detection (TRCDNet). The proposed network is based on an encoder-decoder structure. In the encoder, a ContextGhostLayer module is proposed to encode richer semantic information and address challenging cases such as thin clouds in RSIs. In addition, we design a transformer-based video sequence transformer (VSTR) block; based on the attention mechanism, the VSTR block fully extracts across-frame relations. In the proposed decoder, the cloud masks are gradually recovered to the same scale as the input image. To evaluate the methods, we create a video cloud detection dataset, Fengyun4aCloud, based on videos captured by the Fengyun-4 (FY-4) satellite. Extensive experiments against current cloud detection methods, semantic segmentation methods, and video semantic segmentation (VSS) methods indicate that the designed TRCDNet achieves state-of-the-art performance in video cloud detection.
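This record contains no implementation details beyond the abstract. The following is a minimal, illustrative PyTorch sketch of attention applied jointly to tokens from several consecutive frames, i.e., the kind of across-frame relation modeling the VSTR block is described as performing. All module names, shapes, and hyperparameters here are assumptions for illustration, not the authors' implementation.

# Illustrative cross-frame attention sketch (assumed design, not the paper's code).
import torch
import torch.nn as nn

class CrossFrameAttention(nn.Module):
    """Self-attention over tokens pooled from all frames of a short clip."""

    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, T, C, H, W) encoder features for T consecutive frames.
        b, t, c, h, w = feats.shape
        # Flatten every frame's spatial grid into tokens and concatenate over time,
        # so attention can relate positions both within and across frames.
        tokens = feats.flatten(3).permute(0, 1, 3, 2).reshape(b, t * h * w, c)
        tokens = self.norm(tokens)
        out, _ = self.attn(tokens, tokens, tokens)
        # Restore the (B, T, C, H, W) layout and add a residual connection.
        out = out.reshape(b, t, h * w, c).permute(0, 1, 3, 2).reshape(b, t, c, h, w)
        return feats + out

if __name__ == "__main__":
    clip_feats = torch.randn(2, 4, 256, 16, 16)   # 2 clips, 4 frames each (assumed sizes)
    fused = CrossFrameAttention()(clip_feats)
    print(fused.shape)                            # torch.Size([2, 4, 256, 16, 16])

In an encoder-decoder pipeline of the kind the abstract describes, such a block would sit between the per-frame encoder and the decoder that progressively upsamples the fused features back to the input resolution.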
Pages: 14
Related Papers
50 records in total
  • [41] Fire Detection using Transformer Network
    Shahid, Mohammad
    Hua, Kai-lung
    PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR '21), 2021, : 627 - 630
  • [42] Hybrid Transformer Network for Deepfake Detection
    Khan, Sohail Ahmed
    Dang-Nguyen, Duc-Tien
    19TH INTERNATIONAL CONFERENCE ON CONTENT-BASED MULTIMEDIA INDEXING, CBMI 2022, 2022, : 8 - 14
  • [43] CNN-TransNet: A Hybrid CNN-Transformer Network With Differential Feature Enhancement for Cloud Detection
    Ma, Nan
    Sun, Lin
    He, Yawen
    Zhou, Chenghu
    Dong, Chuanxiang
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2023, 20
  • [44] A Hybrid Algorithm with Swin Transformer and Convolution for Cloud Detection
    Gong, Chengjuan
    Long, Tengfei
    Yin, Ranyu
    Jiao, Weili
    Wang, Guizhou
    REMOTE SENSING, 2023, 15 (21)
  • [45] Object Detection of Occlusion Point Cloud based on Transformer
    Zhou, Jing
    Zhou, Jian
    Lin, Teng Xing
    Gong, Zi Xin
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [46] A Lightweight Cloud and Cloud Shadow Detection Transformer With Prior-Knowledge Guidance
    Fan, Shumin
    Song, Tianyu
    Jin, Guiyue
    Jin, Jiyu
    Li, Qing
    Xia, Xinghui
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2024, 21
  • [47] ConTrans-Detect: A Multi-Scale Convolution-Transformer Network for DeepFake Video Detection
    Sun, Weirong
    Ma, Yujun
    Zhang, Hong
    Wang, Ruili
    2023 29TH INTERNATIONAL CONFERENCE ON MECHATRONICS AND MACHINE VISION IN PRACTICE, M2VIP 2023, 2023,
  • [48] SCOTCH and SODA: A Transformer Video Shadow Detection Framework
    Liu, Lihao
    Prost, Jean
    Zhu, Lei
    Papadakis, Nicolas
    Lio, Pietro
    Schonlieb, Carola-Bibiane
    Aviles-Rivero, Angelica I.
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 10449 - 10458
  • [49] Learning Video Localization on Segment-Level Video Copy Detection with Transformer
    Zhang, Chi
    Liu, Jie
    Zhang, Shuwu
    Zeng, Zhi
    Huang, Ying
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT VII, 2023, 14260 : 439 - 450
  • [50] Video Sparse Transformer With Attention-Guided Memory for Video Object Detection
    Fujitake, Masato
    Sugimoto, Akihiro
    IEEE ACCESS, 2022, 10 : 65886 - 65900