A novel framework for crowd counting using video and audio

Cited by: 1
Authors
Zou, Yi [1]
Min, Weidong [1, 2, 3]
Zhao, Haoyu [1]
Han, Qing [1]
Affiliations
[1] Nanchang Univ, Sch Math & Comp Sci, Nanchang 330031, Peoples R China
[2] Nanchang Univ, Inst Metaverse, Nanchang 330031, Peoples R China
[3] Jiangxi Key Lab Smart City, Nanchang 330031, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Crowd counting; VACCNet; Video Crowd Counting; Multiple direction audio assistance
DOI
10.1016/j.compeleceng.2023.108754
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Crowd counting is important in many application areas. Existing methods are inaccurate in scenes with perspective effects and in low-illumination scenes. In addition, existing audio-assisted methods use only local audio, which does not capture the spatial features of sound arriving from all directions. To alleviate these problems, a novel framework named Video and Audio-assisted Crowd Counting Network (VACCNet) is proposed. The framework consists of two submodules: a Video Crowd Counting (VCC) module and an Audio-assisted Crowd Counting (ACC) module. The visual features from the VCC module and the fused audio features from the ACC module are combined to produce the final density map. To evaluate VACCNet, a new self-collected dataset named multiPle dIrection Assistance couNting netwOrk (PIANO) is built. Experimental results on existing benchmarks and on PIANO show that the proposed method improves on conventional methods by 14.23% on average.
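The abstract does not say how the visual and audio features are combined, so the following is only a minimal sketch under stated assumptions, not the VACCNet implementation. It assumes a simple late fusion: an audio feature vector is tiled over the spatial grid, concatenated with the visual feature map, and projected to a single-channel density map, whose sum gives the estimated count. The function name `fuse_and_count`, the `weights` projection, and the concatenation scheme are all hypothetical.

```python
import numpy as np


def fuse_and_count(visual_feat, audio_feat, weights):
    """Hypothetical late fusion of visual and audio features.

    visual_feat: array of shape (C_v, H, W), e.g. output of a video branch.
    audio_feat:  array of shape (C_a,), e.g. a fused audio embedding.
    weights:     array of shape (C_v + C_a,), a 1x1 projection to one channel.
    Returns (density_map, count); this is an assumed scheme, not the paper's.
    """
    _, h, w = visual_feat.shape
    # Tile the audio vector so every spatial location sees the same audio context.
    audio_map = np.broadcast_to(
        audio_feat[:, None, None], (audio_feat.shape[0], h, w)
    )
    # Channel-wise concatenation: shape (C_v + C_a, H, W).
    fused = np.concatenate([visual_feat, audio_map], axis=0)
    # 1x1 projection over channels gives a single-channel map of shape (H, W).
    density = np.tensordot(weights, fused, axes=([0], [0]))
    # Clamp to zero so the map can be read as a non-negative density.
    density = np.maximum(density, 0.0)
    # In density-map counting, the estimated count is the sum over the map.
    return density, float(density.sum())
```

In a trained network the projection weights would be learned; here they are passed in by hand purely to show the data flow from two feature sources to one density map and a count.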
Pages: 12
Related papers
50 items in total
  • [1] A Novel Approach for People Counting and Tracking from Crowd Video
    Sagun, M. Ayyuce Kizrak
    Bolat, Bulent
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON INNOVATIONS IN INTELLIGENT SYSTEMS AND APPLICATIONS (INISTA), 2017, : 277 - 281
  • [2] A Crowd Counting Framework Combining with Crowd Location
    Zhang, Jin
    Chen, Sheng
    Tian, Sen
    Gong, Wenan
    Cai, Guoshan
    Wang, Ying
    [J]. JOURNAL OF ADVANCED TRANSPORTATION, 2021, 2021
  • [4] Audio-video based people counting and security framework for traffic crossings
    Mittal, Ankush
    Jain, Ankur
    Agarwal, Ganesh K.
    [J]. JOURNAL OF VLSI SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2007, 49 (03): : 377 - 391
  • [5] A crowd video retrieval framework using generic descriptors
    Wong, Pei Voon
    Mustapha, Norwati
    Affendey, Lilly Suriani
    Khalid, Fatimah
    [J]. Journal of Computers (Taiwan), 2020, 31 (01) : 34 - 45
  • [6] Audio-Visual Transformer Based Crowd Counting
    Sajid, Usman
    Chen, Xiangyu
    Sajid, Hasan
    Kim, Taejoon
    Wang, Guanghui
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 2249 - 2259
  • [7] Frame-Recurrent Video Crowd Counting
    Hou, Yi
    Zhang, Shanghang
    Ma, Rui
    Jia, Huizhu
    Xie, Xiaodong
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (09) : 5186 - 5199
  • [8] TRIPLE ATTENTION FOR ROBUST VIDEO CROWD COUNTING
    Wu, Qiyao
    Zhang, Chongyang
    Kong, Xiyu
    Zhao, Muming
    Chen, Yanjun
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020, : 1966 - 1970
  • [9] AVMSN: An Audio-Visual Two Stream Crowd Counting Framework Under Low-Quality Conditions
    Hu, Ruihan
    Mo, Qinglong
    Xie, Yuanfei
    Xu, Yongqian
    Chen, Jiaqi
    Yang, Yalun
    Zhou, Hongjian
    Tang, Zhi-Ri
    Wu, Edmond Q.
    [J]. IEEE ACCESS, 2021, 9 : 80500 - 80510