PANDA: A Gigapixel-level Human-centric Video Dataset

Cited by: 44
Authors
Wang, Xueyang [1 ]
Zhang, Xiya [1 ]
Zhu, Yinheng [1 ]
Guo, Yuchen [1 ]
Yuan, Xiaoyun [1 ]
Xiang, Liuyu [1 ]
Wang, Zerun [1 ]
Ding, Guiguang [1 ]
Brady, David [2 ]
Dai, Qionghai [1 ]
Fang, Lu [1 ]
Affiliations
[1] Tsinghua Univ, Beijing, Peoples R China
[2] Duke Univ, Durham, NC 27706 USA
Keywords
DATA SET; ATTENTION; OBJECT; MODEL;
DOI
10.1109/CVPR42600.2020.00333
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present PANDA, the first gigaPixel-level humAN-centric viDeo dAtaset, for large-scale, long-term, and multi-object visual analysis. The videos in PANDA were captured by a gigapixel camera and cover real-world scenes with both a wide field of view (~1 km² area) and high-resolution details (~gigapixel-level per frame). The scenes may contain 4k head counts with over 100× scale variation. PANDA provides enriched and hierarchical ground-truth annotations, including 15,974.6k bounding boxes, 111.8k fine-grained attribute labels, 12.7k trajectories, 2.2k groups and 2.9k interactions. We benchmark the human detection and tracking tasks. Due to the vast variance of pedestrian pose, scale, occlusion and trajectory, existing approaches are challenged in both accuracy and efficiency. Given the uniqueness of PANDA, with both wide FoV and high resolution, a new task of interaction-aware group detection is introduced. We design a 'global-to-local zoom-in' framework, in which global trajectories and local interactions are encoded simultaneously, yielding promising results. We believe PANDA will contribute to the communities of artificial intelligence and praxeology by enabling the understanding of human behaviors and interactions in large-scale real-world scenes. PANDA website: http://www.panda-dataset.com.
Pages: 3265 - 3275
Page count: 11
Related Papers
50 items in total
  • [31] When Visual Grounding Meets Gigapixel-level Large-scale Scenes: Benchmark and Approach
    Ma, Tao
    Bai, Bing
    Lin, Haozhe
    Wang, Heyuan
    Wang, Yu
    Luo, Lin
    Fang, Lu
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 22119 - 22128
  • [32] The Human-Centric Metaverse: A Survey
    Yang, Riyan
    Li, Lin
    Gan, Wensheng
    Chen, Zefeng
    Qi, Zhenlian
    COMPANION OF THE WORLD WIDE WEB CONFERENCE, WWW 2023, 2023, : 1296 - 1306
  • [33] Overview of Human-Centric Computing
    Iida, Ichiro
    Morita, Toshihiko
    FUJITSU SCIENTIFIC & TECHNICAL JOURNAL, 2012, 48 (02): : 124 - 128
  • [34] TOWARD A HUMAN-CENTRIC INTERNET
    West, Jessamyn
    LIBRARY JOURNAL, 2010, 135 (02) : 24 - 25
  • [35] Human-Centric Image Captioning
    Yang, Zuopeng
    Wang, Pengbo
    Chu, Tianshu
    Yang, Jie
    PATTERN RECOGNITION, 2022, 126
  • [38] GigaMVS: A Benchmark for Ultra-Large-Scale Gigapixel-Level 3D Reconstruction
    Zhang, Jianing
    Zhang, Jinzhi
    Mao, Shi
    Ji, Mengqi
    Wang, Guangyu
    Chen, Zequn
    Zhang, Tian
    Yuan, Xiaoyun
    Dai, Qionghai
    Fang, Lu
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (11) : 7534 - 7550
  • [39] Industry 5 and the Human in Human-Centric Manufacturing
    Briken, Kendra
    Moore, Jed
    Scholarios, Dora
    Rose, Emily
    Sherlock, Andrew
    SENSORS, 2023, 23 (14)
  • [40] Towards TRUE human-centric computation
    Rabaey, Jan M.
    COMPUTER COMMUNICATIONS, 2018, 131 : 73 - 76