PANDA: A Gigapixel-level Human-centric Video Dataset

Cited by: 44
Authors
Wang, Xueyang [1]
Zhang, Xiya [1]
Zhu, Yinheng [1]
Guo, Yuchen [1]
Yuan, Xiaoyun [1]
Xiang, Liuyu [1]
Wang, Zerun [1]
Ding, Guiguang [1]
Brady, David [2]
Dai, Qionghai [1]
Fang, Lu [1]
Affiliations
[1] Tsinghua Univ, Beijing, Peoples R China
[2] Duke Univ, Durham, NC 27706 USA
Keywords
DATA SET; ATTENTION; OBJECT; MODEL
DOI
10.1109/CVPR42600.2020.00333
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present PANDA, the first gigaPixel-level humAN-centric viDeo dAtaset, for large-scale, long-term, and multi-object visual analysis. The videos in PANDA were captured by a gigapixel camera and cover real-world scenes with both wide field-of-view (~1 km² area) and high-resolution details (~gigapixel-level/frame). The scenes may contain 4k head counts with over 100× scale variation. PANDA provides enriched and hierarchical ground-truth annotations, including 15,974.6k bounding boxes, 111.8k fine-grained attribute labels, 12.7k trajectories, 2.2k groups and 2.9k interactions. We benchmark the human detection and tracking tasks. Due to the vast variance of pedestrian pose, scale, occlusion and trajectory, existing approaches are challenged by both accuracy and efficiency. Given the uniqueness of PANDA with both wide FoV and high resolution, a new task of interaction-aware group detection is introduced. We design a 'global-to-local zoom-in' framework, where global trajectories and local interactions are simultaneously encoded, yielding promising results. We believe PANDA will contribute to the community of artificial intelligence and praxeology by understanding human behaviors and interactions in large-scale real-world scenes. PANDA Website: http://www.panda-dataset.com.
Pages: 3265-3275
Page count: 11
Related papers
50 in total
  • [1] Towards real-time object detection in GigaPixel-level video
    Chen, Kai
    Wang, Zerun
    Wang, Xueyang
    Gong, Dahan
    Yu, Longlong
    Guo, Yuchen
    Ding, Guiguang
    NEUROCOMPUTING, 2022, 477 : 14 - 24
  • [2] Human-Centric Relation Segmentation: Dataset and Solution
    Liu, Si
    Wang, Zitian
    Gao, Yulu
    Ren, Lejian
    Liao, Yue
    Ren, Guanghui
    Li, Bo
    Yan, Shuicheng
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (09) : 4987 - 5001
  • [3] GIGAPIXEL-LEVEL IMAGE CROWD COUNTING USING CSRNET
    Cao, Zhijie
    Yan, Renyou
    Huang, Yiyong
    Shi, Zhiru
    2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW), 2019, : 426 - 428
  • [4] PVDet: Towards pedestrian and vehicle detection on gigapixel-level images
    Mo, Wanghao
    Zhang, Wendong
    Wei, Hongyang
    Cao, Ruyi
    Ke, Yan
    Luo, Yiwen
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 118
  • [5] Music Conditioned Generation for Human-Centric Video
    Zhao, Zimeng
    Zuo, Binghui
    Wang, Yangang
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 506 - 510
  • [6] Toward human-centric deep video understanding
    Zeng, Wenjun
    APSIPA TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING, 2020, 9
  • [7] Human-Centric Navigation System Video Vortex for Video Retrieval
    Haseyama, Miki
    Ogawa, Takahiro
    IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE 2011), 2011, : 167 - 168
  • [8] Speed up Object Detection on Gigapixel-level Images with Patch Arrangement
    Fan, Jiahao
    Liu, Huabin
    Yang, Wenjie
    See, John
    Zhang, Aixin
    Lin, Weiyao
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 4643 - 4651
  • [9] GigaHumanDet: Exploring Full-Body Detection on Gigapixel-Level Images
    Liu, Chenglong
    Wei, Haoran
    Yang, Jinze
    Liu, Jintao
    Li, Wenxi
    Guo, Yuchen
    Fang, Lu
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 9, 2024, : 10092 - 10100
  • [10] A synthetic human-centric dataset generation pipeline for active robotic vision
    Georgiadis, Charalampos
    Passalis, Nikolaos
    Nikolaidis, Nikos
    PATTERN RECOGNITION LETTERS, 2024, 179 : 17 - 23