Self-attention Guidance Based Crowd Localization and Counting

Cited by: 1
Authors
Ma, Zhouzhou [1, 2]
Gu, Guanghua [1, 2]
Zhao, Wenrui [1, 2]
Affiliations
[1] Yanshan Univ, Sch Informat Sci & Engn, Qinhuangdao 066000, Peoples R China
[2] Hebei Key Lab Informat Transmiss & Signal Proc, Qinhuangdao 066000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Crowd localization; crowd counting; transformer; point supervision; object detection; IMAGE; NETWORK;
DOI
10.1007/s11633-023-1428-6
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Most existing studies on crowd analysis are limited to counting and cannot provide the exact location of individuals. This paper proposes a self-attention guidance based crowd localization and counting network (SA-CLCN), which can simultaneously locate and count crowds. We take the form of object detection, using the original point annotations of crowd datasets as supervision to train the network. Ultimately, the center point coordinate of each head as well as the number of people in the crowd are predicted. Specifically, to cope with the spatial and positional variations of the crowd, the proposed method introduces a transformer to construct a global-local feature extractor (GLFE) together with the convolutional structure. It establishes the near-to-far dependency between elements so that the global context and local detail features of the crowd image can be extracted simultaneously. Then, this paper designs a pyramid feature fusion module (PFFM) to fuse the global and local information from high level to low level to obtain a multiscale feature representation. In downstream tasks, this paper predicts candidate point offsets and confidence scores with a simple regression header and classification header. In addition, the Hungarian algorithm is used to match the predicted point set and the labelled point set to facilitate the calculation of losses. The proposed network avoids the errors or higher costs associated with using traditional density maps or bounding box annotations. Importantly, we have conducted extensive experiments on several crowd datasets, and the proposed method has produced competitive results in both counting and localization.
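The matching step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the matching cost, so the cost used here (Euclidean distance minus a weighted confidence score) and the names `hungarian`, `match_points`, and `tau` are assumptions made for illustration, not the authors' implementation. The sketch also assumes there are at least as many predicted points as labelled points.

```python
import math

def hungarian(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns one (row, col) pair per row. This is the standard
    potential-based O(n^2 * m) form of the Hungarian algorithm.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)          # row potentials (1-indexed)
    v = [0.0] * (m + 1)          # column potentials (1-indexed)
    p = [0] * (m + 1)            # p[j] = row currently assigned to column j
    way = [0] * (m + 1)          # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:       # reached a free column
                break
        while j0 != 0:           # flip assignments along the path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0]

def match_points(gt_points, pred_points, scores, tau=1.0):
    """Match each labelled point to exactly one predicted point.

    Assumed cost: distance(gt, pred) - tau * confidence(pred).
    Returns sorted (gt_index, pred_index) pairs.
    """
    if not gt_points:
        return []
    cost = [[math.dist(g, p) - tau * s for p, s in zip(pred_points, scores)]
            for g in gt_points]
    return sorted(hungarian(cost))

# Two labelled heads, three candidate predictions with confidence scores.
gt = [(10, 10), (50, 50)]
preds = [(48, 52), (11, 9), (200, 200)]
scores = [0.9, 0.8, 0.1]
print(match_points(gt, preds, scores))  # [(0, 1), (1, 0)]
```

In this toy case the far-away, low-confidence candidate is left unmatched. In a point-supervised setup such unmatched candidates would typically be treated as negatives when computing the classification loss.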
Pages: 966-982
Page count: 17