CCST: crowd counting with swin transformer

Cited by: 22
|
Authors
Li, Bo [1]
Zhang, Yong [1]
Xu, Haihui [2]
Yin, Baocai [1]
Affiliations
[1] Beijing Univ Technol, Beijing Inst Artificial Intelligence, Dept Informat Sci, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
[2] Beijing Municipal Transportat Operat Coordinat Ct, Beijing 100161, Peoples R China
Source
VISUAL COMPUTER | 2023, Vol. 39, Issue 7
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China;
Keywords
Crowd counting; Transformer; Uneven distribution of crowd density; Large span of head size; Feature adaptive fusion; IMAGE;
DOI
10.1007/s00371-022-02485-3
CLC Number
TP31 [Computer Software];
Discipline Classification Code
081202; 0835;
Abstract
Accurately estimating the number of individuals contained in an image is the purpose of crowd counting. It has long faced two major difficulties: the uneven distribution of crowd density and the large span of head sizes. For the former, most CNN-based methods divide the image into multiple patches and process them independently, ignoring the connections between patches. For the latter, multi-scale feature fusion methods based on feature pyramids ignore the matching relationship between head size and the hierarchical features. To address these issues, we propose a crowd counting network named CCST based on the Swin Transformer, and tailor a feature adaptive fusion regression head called FAFHead. The Swin Transformer can fully exchange information within and between patches, effectively alleviating the problem of unevenly distributed crowd density. FAFHead adaptively fuses multi-level features, improves the matching between head size and feature pyramid levels, and relieves the problem of the large span of head sizes. Experimental results on common datasets show that CCST achieves better counting performance than all weakly supervised counting works and the great majority of popular density map-based fully supervised works.
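As a concrete illustration of the "feature adaptive fusion" idea described in the abstract, the sketch below shows a minimal PyTorch head that projects multi-level (Swin-like) pyramid features to a common width, fuses them with learned softmax weights, and regresses a density map whose spatial sum is the predicted count. The class name AdaptiveFusionHead, the channel widths (96/192/384/768), and the scalar per-level weighting are illustrative assumptions, not the paper's actual FAFHead design.

```python
# Illustrative sketch only: a generic feature-adaptive-fusion regression head.
# Names, channel sizes, and the fusion scheme are assumptions for illustration,
# NOT the FAFHead implementation from the CCST paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveFusionHead(nn.Module):
    """Fuse multi-level features with learned per-level weights, then regress a density map."""
    def __init__(self, in_channels=(96, 192, 384, 768), mid_channels=256):
        super().__init__()
        # Project every pyramid level to a common channel width.
        self.proj = nn.ModuleList(nn.Conv2d(c, mid_channels, kernel_size=1) for c in in_channels)
        # One scalar logit per level; softmax turns them into adaptive fusion weights.
        self.level_logits = nn.Parameter(torch.zeros(len(in_channels)))
        self.regressor = nn.Sequential(
            nn.Conv2d(mid_channels, mid_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, 1, kernel_size=1),
            nn.ReLU(inplace=True),  # density values are non-negative
        )

    def forward(self, features):
        # features: list of (B, C_i, H_i, W_i) tensors, highest resolution first.
        target_size = features[0].shape[-2:]
        weights = torch.softmax(self.level_logits, dim=0)
        fused = 0
        for w, proj, feat in zip(weights, self.proj, features):
            feat = proj(feat)
            feat = F.interpolate(feat, size=target_size, mode="bilinear", align_corners=False)
            fused = fused + w * feat
        density = self.regressor(fused)      # (B, 1, H, W) density map
        count = density.sum(dim=(1, 2, 3))   # predicted crowd count per image
        return density, count

if __name__ == "__main__":
    # Fake Swin-like pyramid features for a 384x384 input (strides 4, 8, 16, 32).
    feats = [torch.randn(2, c, 384 // s, 384 // s)
             for c, s in zip((96, 192, 384, 768), (4, 8, 16, 32))]
    density, count = AdaptiveFusionHead()(feats)
    print(density.shape, count.shape)  # torch.Size([2, 1, 96, 96]) torch.Size([2])
```

In this sketch the per-level weights are single learned scalars shared across the whole image; a spatially varying gating (e.g. weights predicted per pixel) would be a natural alternative when head sizes change strongly within one scene.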
Pages: 2671-2682
Number of pages: 12
Related Papers
50 records in total
  • [1] CCST: crowd counting with swin transformer
    Bo Li
    Yong Zhang
    Haihui Xu
    Baocai Yin
    The Visual Computer, 2023, 39 : 2671 - 2682
  • [2] Weakly supervised crowd counting based on Swin Transformer
    Feng, Min
    Hao, Linlin
    Kuang, Yonggang
    2023 THE 6TH INTERNATIONAL CONFERENCE ON ROBOT SYSTEMS AND APPLICATIONS, ICRSA 2023, 2023, : 229 - 236
  • [3] Crowd counting in congested scene by CNN and Transformer Crowd counting for converged networks
    Lin, Yuanyuan
    Yang, Huicheng
    Hu, Yaocong
    Shuai, Zhen
    Li, Wenting
    PROCEEDINGS OF 2023 7TH INTERNATIONAL CONFERENCE ON ELECTRONIC INFORMATION TECHNOLOGY AND COMPUTER ENGINEERING, EITCE 2023, 2023, : 1092 - 1095
  • [4] Congested crowd instance localization with dilated convolutional swin transformer
    Gao, Junyu
    Gong, Maoguo
    Li, Xuelong
    NEUROCOMPUTING, 2022, 513 : 94 - 103
  • [5] Crowd counting via Localization Guided Transformer
    Yuan, Lixian
    Chen, Yandong
    Wu, Hefeng
    Wan, Wentao
    Chen, Pei
    COMPUTERS & ELECTRICAL ENGINEERING, 2022, 104
  • [6] Crowd behavior detection: leveraging video swin transformer for crowd size and violence level analysis
    Qaraqe, Marwa
    Yang, Yin David
    Varghese, Elizabeth B.
    Basaran, Emrah
    Elzein, Almiqdad
    APPLIED INTELLIGENCE, 2024, 54 (21) : 10709 - 10730
  • [7] An interactive network based on transformer for multimodal crowd counting
    Yu, Ying
    Cai, Zhen
    Miao, Duoqian
    Qian, Jin
    Tang, Hong
    APPLIED INTELLIGENCE, 2023, 53 (19) : 22602 - 22614
  • [8] An interactive network based on transformer for multimodal crowd counting
    Ying Yu
    Zhen Cai
    Duoqian Miao
    Jin Qian
    Hong Tang
    Applied Intelligence, 2023, 53 : 22602 - 22614
  • [9] Transformer-CNN hybrid network for crowd counting
    Yu J.
    Yu Y.
    Qian J.
    Han X.
    Zhu F.
    Zhu Z.
    Journal of Intelligent and Fuzzy Systems, 2024, 46 (04): : 10773 - 10785
  • [10] Audio-Visual Transformer Based Crowd Counting
    Sajid, Usman
    Chen, Xiangyu
    Sajid, Hasan
    Kim, Taejoon
    Wang, Guanghui
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 2249 - 2259