PANDA: A Gigapixel-level Human-centric Video Dataset

Cited by: 44

Authors:

Wang, Xueyang [1]

Zhang, Xiya [1]

Zhu, Yinheng [1]

Guo, Yuchen [1]

Yuan, Xiaoyun [1]

Xiang, Liuyu [1]

Wang, Zerun [1]

Ding, Guiguang [1]

Brady, David [2]

Dai, Qionghai [1]

Fang, Lu [1]

Affiliations:

[1] Tsinghua Univ, Beijing, Peoples R China

[2] Duke Univ, Durham, NC 27706 USA

Source:

2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2020

Keywords:

DATA SET; ATTENTION; OBJECT; MODEL;

DOI:

10.1109/CVPR42600.2020.00333

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present PANDA, the first gigaPixel-level humAN-centric viDeo dAtaset, for large-scale, long-term, and multi-object visual analysis. The videos in PANDA were captured by a gigapixel camera and cover real-world scenes with both wide field-of-view (~1 km² area) and high-resolution details (~gigapixel-level/frame). The scenes may contain 4k head counts with over 100× scale variation. PANDA provides enriched and hierarchical ground-truth annotations, including 15,974.6k bounding boxes, 111.8k fine-grained attribute labels, 12.7k trajectories, 2.2k groups and 2.9k interactions. We benchmark the human detection and tracking tasks. Due to the vast variance of pedestrian pose, scale, occlusion and trajectory, existing approaches are challenged by both accuracy and efficiency. Given the uniqueness of PANDA with both wide FoV and high resolution, a new task of interaction-aware group detection is introduced. We design a 'global-to-local zoom-in' framework, where global trajectories and local interactions are simultaneously encoded, yielding promising results. We believe PANDA will contribute to the community of artificial intelligence and praxeology by understanding human behaviors and interactions in large-scale real-world scenes. PANDA Website: http://www.panda-dataset.com.


Pages: 3265-3275

Number of pages: 11
