Cross-modal collaborative representation and multi-level supervision for crowd counting

Cited by: 4
Authors
Li, Shufang [1 ,2 ]
Hu, Zhengping [1 ]
Zhao, Mengyao [1 ]
Bi, Shuai [1 ]
Sun, Zhe [1 ]
Affiliations
[1] Yanshan Univ, Sch Informat Sci & Engn, West Hebei St 438, Qinhuangdao 066004, Hebei, Peoples R China
[2] Hebei Univ Environm Engn, Dept Informat Engn, Jingang Rd 8, Qinhuangdao 066102, Hebei, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation;
Keywords
Crowd counting; Cross-modal collaborative representation learning; Multi-level supervision;
DOI
10.1007/s11760-022-02266-4
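Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes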
0808 ; 0809 ;
Abstract
Crowd features are typically extracted from RGB images for density estimation and crowd counting. However, RGB images degrade under very poor illumination, making it hard to identify semantic objects accurately; thermal images can help solve this problem. To exploit both optical and thermal imaging information, we propose a crowd counting method based on cross-modal collaborative representation and multi-level supervision. To capture the complementary features of the different modalities, RGB and thermal images are used as modality-specific streams for cross-modal collaborative learning. Missing modality-specific information is compensated and shared information is enhanced, both through aggregation and distribution computations between the specific streams and the shared stream. Furthermore, to weaken the influence of the background and strengthen the identification of crowd regions, we combine multi-scale crowd feature extraction with region recognition. Multiple output layers are added along the propagation path of the multi-modal streams to provide multi-level supervision. Moreover, we replace the baseline training loss with the Bayesian loss, which supervises the expected count at each annotation point. Finally, comprehensive experiments on the RGBT-CC benchmark show the effectiveness of the proposed method.
Pages: 601-608
Page count: 8
Related papers
50 in total
  • [1] Cross-modal collaborative representation and multi-level supervision for crowd counting
    Shufang Li
    Zhengping Hu
    Mengyao Zhao
    Shuai Bi
    Zhe Sun
    [J]. Signal, Image and Video Processing, 2023, 17 : 601 - 608
  • [2] Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting
    Liu, Lingbo
    Chen, Jiaqi
    Wu, Hefeng
    Li, Guanbin
    Li, Chenglong
    Lin, Liang
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 4821 - 4831
  • [3] A cross-modal crowd counting method combining CNN and cross-modal transformer
    Zhang, Shihui
    Wang, Wei
    Zhao, Weibo
    Wang, Lei
    Li, Qunpeng
    [J]. IMAGE AND VISION COMPUTING, 2023, 129
  • [4] Multi-level adversarial attention cross-modal hashing
    Wang, Benhui
    Zhang, Huaxiang
    Zhu, Lei
    Nie, Liqiang
    Liu, Li
    [J]. SIGNAL PROCESSING-IMAGE COMMUNICATION, 2023, 117
  • [5] Multi-Level Cross-Modal Alignment for Image Clustering
    Qiu, Liping
    Zhang, Qin
    Chen, Xiaojun
    Cai, Shaotian
    [J]. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 13, 2024, : 14695 - 14703
  • [6] CCANet: A Collaborative Cross-Modal Attention Network for RGB-D Crowd Counting
    Liu, Yanbo
    Cao, Guo
    Shi, Boshan
    Hu, Yingxiang
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 154 - 165
  • [7] Learning the cross-modal discriminative feature representation for RGB-T crowd counting
    Li, He
    Zhang, Shihui
    Kong, Weihang
    [J]. KNOWLEDGE-BASED SYSTEMS, 2022, 257
  • [8] Deep Multi-Level Semantic Hashing for Cross-Modal Retrieval
    Ji, Zhenyan
    Yao, Weina
    Wei, Wei
    Song, Houbing
    Pi, Huaiyu
    [J]. IEEE ACCESS, 2019, 7 : 23667 - 23674
  • [9] Multi-Level Correlation Adversarial Hashing for Cross-Modal Retrieval
    Ma, Xinhong
    Zhang, Tianzhu
    Xu, Changsheng
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (12) : 3101 - 3114
  • [10] Cross-Modal Information Aggregation and Distribution Method for Crowd Counting
    Chen, Yin
    Zhou, Yuhao
    Dong, Tianyang
    [J]. ADVANCES IN COMPUTER GRAPHICS, CGI 2023, PT IV, 2024, 14498 : 106 - 119