PANDA: A Gigapixel-level Human-centric Video Dataset

Cited by: 44
Authors
Wang, Xueyang [1 ]
Zhang, Xiya [1 ]
Zhu, Yinheng [1 ]
Guo, Yuchen [1 ]
Yuan, Xiaoyun [1 ]
Xiang, Liuyu [1 ]
Wang, Zerun [1 ]
Ding, Guiguang [1 ]
Brady, David [2 ]
Dai, Qionghai [1 ]
Fang, Lu [1 ]
Affiliations
[1] Tsinghua Univ, Beijing, Peoples R China
[2] Duke Univ, Durham, NC 27706 USA
Keywords
DATA SET; ATTENTION; OBJECT; MODEL
DOI
10.1109/CVPR42600.2020.00333
CLC Number
TP18 (Artificial Intelligence Theory)
Discipline Codes
081104; 0812; 0835; 1405
Abstract
We present PANDA, the first gigaPixel-level humAN-centric viDeo dAtaset, for large-scale, long-term, and multi-object visual analysis. The videos in PANDA were captured by a gigapixel camera and cover real-world scenes with both wide field-of-view (~1 km² area) and high-resolution details (~gigapixel-level per frame). The scenes may contain 4k head counts with over 100× scale variation. PANDA provides enriched and hierarchical ground-truth annotations, including 15,974.6k bounding boxes, 111.8k fine-grained attribute labels, 12.7k trajectories, 2.2k groups and 2.9k interactions. We benchmark the human detection and tracking tasks. Due to the vast variance of pedestrian pose, scale, occlusion and trajectory, existing approaches are challenged in both accuracy and efficiency. Given the uniqueness of PANDA, with both wide FoV and high resolution, a new task of interaction-aware group detection is introduced. We design a 'global-to-local zoom-in' framework, where global trajectories and local interactions are simultaneously encoded, yielding promising results. We believe PANDA will contribute to the community of artificial intelligence and praxeology by understanding human behaviors and interactions in large-scale real-world scenes. PANDA Website: http://www.panda-dataset.com.
Pages: 3265 - 3275
Page count: 11