Head-Free Lightweight Semantic Segmentation with Linear Transformer

Cited by: 0

Authors:

Dong, Bo [1]

Wang, Pichao [1]

Wang, Fan [1]

Affiliations:

[1] Alibaba Group, Hangzhou, People's Republic of China

Source:

THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 1 | 2023

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Existing semantic segmentation works have mainly focused on designing effective decoders; however, the computational load introduced by the overall structure has long been ignored, which hinders their application on resource-constrained hardware. In this paper, we propose a head-free lightweight architecture specifically for semantic segmentation, named Adaptive Frequency Transformer (AFFormer). AFFormer adopts a parallel architecture that leverages prototype representations as specific learnable local descriptions; this replaces the decoder and preserves the rich image semantics on high-resolution features. Although removing the decoder eliminates most of the computation, the accuracy of the parallel structure is still limited by low computational resources. Therefore, we employ heterogeneous operators (CNN and Vision Transformer) for pixel embedding and prototype representations to further save computational costs. Moreover, it is very difficult to linearize the complexity of the Vision Transformer from the perspective of the spatial domain. Because semantic segmentation is very sensitive to frequency information, we construct a lightweight prototype learning block with an adaptive frequency filter of complexity O(n) to replace standard self-attention with complexity O(n^2). Extensive experiments on widely adopted datasets demonstrate that AFFormer achieves superior accuracy while retaining only 3M parameters. On the ADE20K dataset, AFFormer achieves 41.8 mIoU at 4.6 GFLOPs, which is 4.4 mIoU higher than Segformer with 45% fewer GFLOPs. On the Cityscapes dataset, AFFormer achieves 78.7 mIoU at 34.4 GFLOPs, which is 2.5 mIoU higher than Segformer with 72.5% fewer GFLOPs. Code is available at https://github.com/dongbo811/AFFormer.
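The relative claims in the abstract can be turned back into the baseline figures they imply. The short sketch below does that arithmetic; the function name `implied_baseline` is hypothetical, and the outputs are values derived from the rounded numbers quoted in the abstract, not figures reported in this record.

```python
def implied_baseline(miou, miou_gain, gflops, gflops_reduction):
    """Back out the baseline implied by a reported result and its relative claims.

    miou             -- reported mIoU of the proposed model
    miou_gain        -- stated absolute mIoU improvement over the baseline
    gflops           -- reported GFLOPs of the proposed model
    gflops_reduction -- stated fractional GFLOPs saving (0.45 means 45% fewer)
    """
    baseline_miou = miou - miou_gain
    # "x% fewer GFLOPs" means gflops = baseline_gflops * (1 - x)
    baseline_gflops = gflops / (1.0 - gflops_reduction)
    return round(baseline_miou, 1), round(baseline_gflops, 1)


# Numbers taken from the abstract above.
ade20k = implied_baseline(41.8, 4.4, 4.6, 0.45)
cityscapes = implied_baseline(78.7, 2.5, 34.4, 0.725)

print("ADE20K implied baseline (mIoU, GFLOPs):", ade20k)
print("Cityscapes implied baseline (mIoU, GFLOPs):", cityscapes)
```

Because the percentages in the abstract are rounded, the implied GFLOPs are approximate and may differ slightly from the baseline's published values.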

Citation

Pages: 516-524

Page count: 9

50 items in total

[31] A Lightweight Road Scene Semantic Segmentation Algorithm
Peng, Jiansheng
Yang, Qing
Hou, Yaru
CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 77 (02): 1929-1948
[32] Lightweight semantic segmentation network for underwater image
Guo H.-R.
Guo J.-C.
Wang Y.-D.
Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2023, 57 (07): 1278-1286
[33] Lightweight semantic segmentation for digital workshop objects
Yi J.
Chen G.
Ru Q.
Li M.
Jisuanji Jicheng Zhizao Xitong/Computer Integrated Manufacturing Systems, CIMS, 2023, 29 (03): 920-929
[34] LiteSeg: A Novel Lightweight ConvNet for Semantic Segmentation
Emara, Taha
Abd El Munim, Hossam E.
Abbas, Hazem M.
2019 DIGITAL IMAGE COMPUTING: TECHNIQUES AND APPLICATIONS (DICTA), 2019: 113-119
[35] PREDICTIVE CONTROL OF HEAD AND EYE-MOVEMENTS DURING HEAD-FREE PURSUIT IN HUMANS
BARNES, GR
JOURNAL OF PHYSIOLOGY-LONDON, 1991, 438: P215-P215
[36] Scene sketch semantic segmentation with hierarchical Transformer
Yang, Jie
Ke, Aihua
Yu, Yaoxiang
Cai, Bo
KNOWLEDGE-BASED SYSTEMS, 2023, 280
[37] Graph Structure Guided Transformer for Semantic Segmentation
Qian, Luyang
Zhang, Canlong
Li, Zhixin
Wang, Zhiwen
2022 IEEE 34TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, ICTAI, 2022: 915-922
[38] A Unified Efficient Pyramid Transformer for Semantic Segmentation
Zhu, Fangrui
Zhu, Yi
Zhang, Li
Wu, Chongruo
Fu, Yanwei
Li, Mu
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021: 2667-2677
[39] MMSFormer: Multimodal Transformer for Material and Semantic Segmentation
Reza, Md Kaykobad
Prater-Bennette, Ashley
Asif, M. Salman
IEEE OPEN JOURNAL OF SIGNAL PROCESSING, 2024, 5: 599-610
[40] CoT: Contourlet Transformer for Hierarchical Semantic Segmentation
Shao, Yilin
Sun, Long
Jiao, Licheng
Liu, Xu
Liu, Fang
Li, Lingling
Yang, Shuyuan
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024: 1-15
