A novel framework for crowd counting using video and audio

Cited by: 1
Authors
Zou, Yi [1]
Min, Weidong [1, 2, 3]
Zhao, Haoyu [1]
Han, Qing [1]
Affiliations
[1] Nanchang Univ, Sch Math & Comp Sci, Nanchang 330031, Peoples R China
[2] Nanchang Univ, Inst Metaverse, Nanchang 330031, Peoples R China
[3] Jiangxi Key Lab Smart City, Nanchang 330031, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Crowd counting; VACCNet; Video Crowd Counting; Multiple direction audio assistance
DOI
10.1016/j.compeleceng.2023.108754
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Crowd counting is important in many areas. Existing methods have poor accuracy in perspective scenes and low-illumination scenes. In addition, existing audio-assisted methods use only local audio, which fails to provide the spatial features of sound from all directions. To alleviate these problems, a novel framework named Video and Audio-assisted Crowd Counting Network (VACCNet) is proposed. The framework consists of two submodules: a Video Crowd Counting (VCC) module and an Audio-assisted Crowd Counting (ACC) module. The visual features from the VCC module and the fused audio features from the ACC module are combined to produce the final density map. To demonstrate the effectiveness of VACCNet, a new self-collected dataset named multiPle dIrection Assistance couNting netwOrk (PIANO) is built. Experimental results on existing benchmarks and PIANO show that the proposed method improves on conventional methods by 14.23% on average.
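The idea of combining visual features with fused multi-direction audio features into a density map can be illustrated with a minimal NumPy sketch. This is not the VACCNet architecture, which the abstract does not specify: the function names, the tensor shapes, the mean over audio directions, and the single linear projection are all assumptions chosen only to show the data flow. The one standard convention used is that the crowd count is the sum of the density map.

```python
import numpy as np


def fuse_to_density(visual, audio_dirs, weights):
    """Combine visual and audio features into a density map (illustrative only).

    visual:     (C_v, H, W) visual feature map (hypothetical VCC output)
    audio_dirs: (D, C_a) one feature vector per audio direction (hypothetical ACC input)
    weights:    (C_v + C_a,) per-channel projection weights (assumed, not learned here)
    """
    _, h, w = visual.shape
    # Assumption: fuse the per-direction audio vectors with a plain mean.
    audio = audio_dirs.mean(axis=0)  # (C_a,)
    # Tile the fused audio vector over every spatial position.
    audio_map = np.broadcast_to(audio[:, None, None], (audio.shape[0], h, w))
    # Stack visual and audio channels.
    combined = np.concatenate([visual, audio_map], axis=0)  # (C_v + C_a, H, W)
    # Per-pixel linear projection to one channel, clipped at zero
    # so that densities are non-negative.
    projected = np.tensordot(weights, combined, axes=([0], [0]))  # (H, W)
    return np.maximum(projected, 0.0)


def crowd_count(density):
    """Estimated count is the sum over the density map."""
    return float(density.sum())


if __name__ == "__main__":
    visual = np.ones((2, 3, 4))
    audio_dirs = np.array([[1.0, 1.0], [3.0, 3.0]])  # two directions
    weights = np.array([0.5, 0.5, 0.25, 0.25])
    density = fuse_to_density(visual, audio_dirs, weights)
    print(density.shape, crowd_count(density))
```

In a trained network the mean and the fixed weights would be replaced by learned layers; the sketch only shows how per-direction audio vectors can be broadcast over the spatial grid and combined with visual channels.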
Pages: 12