Scene-level buildings damage recognition based on Cross Conv-Transformer

Cited by: 1
Authors
Shi, Lingfei [1]
Zhang, Feng [1, 2, 5]
Xia, Junshi [3]
Xie, Jibo [4]
Affiliations
[1] Zhejiang Univ, Sch Earth Sci, Hangzhou, Peoples R China
[2] Zhejiang Prov Key Lab Geog Informat Sci, Hangzhou, Peoples R China
[3] RIKEN Ctr Adv Intelligence Project, Geoinformat Unit, Tokyo, Japan
[4] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing, Peoples R China
[5] Zhejiang Univ, Sch Earth Sci, 38 Zheda Rd, Hangzhou 310027, Peoples R China
Keywords
Scene recognition; damaged buildings; aerial images; transformer
DOI
10.1080/17538947.2023.2261770
Chinese Library Classification number
P9 [Physical Geography]
Discipline classification code
0705; 070501
Abstract
Unlike pixel-based and object-based image recognition, a scene-level perspective can improve the efficiency of assessing large-scale building damage. However, the complexity of disaster scenes and the scarcity of datasets remain major challenges in identifying building damage. To address these challenges, the Cross Conv-Transformer model is proposed to classify and evaluate the degree of building damage using aerial images taken after earthquakes. We employ Conv-Embedding and Conv-Projection to extract features from the images. Integrating convolution with the Transformer reduces the computational burden of the model while enhancing its feature extraction capability. Furthermore, a two-branch Conv-Transformer architecture with global and local attention is designed, so that one branch focuses on global features and the other on local features. A cross-attention fusion module merges feature information from the two branches to enrich the classification features. Finally, we use aerial images captured during the Beichuan and Yushu earthquakes as both the training and test sets to assess the model. The proposed Cross Conv-Transformer model improves classification accuracy by 4.7% and 2.1% compared with ViT and EfficientNet, respectively. The results show that the Cross Conv-Transformer model significantly reduces misclassification between the severely and moderately damaged categories.
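The abstract does not give the layer-level details of the cross-attention fusion module, so the following is only a generic sketch of the idea of fusing two token streams with cross-attention. Everything in it is an assumption made for illustration and not the authors' implementation: the function names, the token counts, the feature width, the omission of learned query/key/value projections, the residual connection, and the mean pooling before concatenation.

```python
import math
import random


def matmul(a, b):
    """Multiply an n x d matrix by a d x m matrix (both lists of lists)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context):
    """Scaled dot-product attention with queries from one branch and
    keys/values from the other. Learned projections are omitted (assumption)."""
    d = len(queries[0])
    context_t = [list(col) for col in zip(*context)]
    scores = matmul(queries, context_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, context), weights


def mean_pool(tokens):
    """Average a token sequence into a single feature vector."""
    n = len(tokens)
    return [sum(col) / n for col in zip(*tokens)]


def fuse(global_tokens, local_tokens):
    """Let each branch attend to the other, add a residual, pool, concatenate."""
    g_att, _ = cross_attention(global_tokens, local_tokens)
    l_att, _ = cross_attention(local_tokens, global_tokens)
    g_out = [[a + b for a, b in zip(t, u)] for t, u in zip(global_tokens, g_att)]
    l_out = [[a + b for a, b in zip(t, u)] for t, u in zip(local_tokens, l_att)]
    return mean_pool(g_out) + mean_pool(l_out)


# Hypothetical sizes: 4 coarse (global) tokens, 16 fine (local) tokens, width 8.
random.seed(0)
dim = 8
global_tokens = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(4)]
local_tokens = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(16)]

fused = fuse(global_tokens, local_tokens)
print(len(fused))  # 16: pooled global features followed by pooled local features
```

In this sketch the fused vector would feed a classification head over the damage categories; a trained model would also apply learned projections and normalization around the attention step.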
Pages: 3987-4007
Number of pages: 21