Key-Frame Extraction for Reducing Human Effort in Object Detection Training for Video Surveillance

Cited by: 2

Authors:

Sinulingga, Hagai R. [1]

Kong, Seong G. [1]

Affiliations:

[1] Sejong Univ, Dept Comp Engn, Seoul 05006, South Korea

Source:

ELECTRONICS | 2023 / Vol. 12 / Issue 13

Keywords:

object detection; video surveillance; key-frame extraction; interactive labeling; deep learning; LOCALIZATION;

DOI:

10.3390/electronics12132956

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

This paper presents a supervised learning scheme that employs key-frame extraction to enhance the performance of pre-trained deep learning models for object detection in surveillance videos. Developing supervised deep learning models requires a significant amount of annotated video frames as training data, which demands substantial human effort for preparation. Key frames, which encompass frames containing false negative or false positive objects, can introduce diversity into the training data and contribute to model improvements. Our proposed approach focuses on detecting false negatives by leveraging the motion information within video frames that contain the detected object region. Key-frame extraction significantly reduces the human effort involved in video frame extraction. We employ interactive labeling to annotate false negative video frames with accurate bounding boxes and labels. These annotated frames are then integrated with the existing training data to create a comprehensive training dataset for subsequent training cycles. Repeating the training cycles gradually improves the object detection performance of deep learning models to monitor a new environment. Experiment results demonstrate that the proposed learning approach improves the performance of the object detection model in a new operating environment, increasing the mean average precision (mAP@0.5) from 54% to 98%. Manual annotation of key frames is reduced by 81% through the proposed key-frame extraction method.
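The abstract does not spell out how false-negative frames are found, so the following is only a minimal sketch of one plausible reading: a frame becomes a key-frame candidate when it contains a moving region that no detection overlaps. The `Frame` structure, the `(x1, y1, x2, y2)` box format, and the `iou_threshold` value are all hypothetical choices made for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical box format: (x1, y1, x2, y2) in pixel coordinates.
Box = Tuple[int, int, int, int]


@dataclass
class Frame:
    index: int
    # Regions where motion was observed (assumed to come from some
    # upstream motion estimator, which is not modelled here).
    motion_boxes: List[Box] = field(default_factory=list)
    # Regions reported by the pre-trained object detector.
    detected_boxes: List[Box] = field(default_factory=list)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_candidate_key_frame(frame: Frame, iou_threshold: float = 0.3) -> bool:
    """True if some moving region is not matched by any detection.

    Such a region is treated as a possible false negative. The
    threshold is an illustrative assumption, not a value from the paper.
    """
    for motion in frame.motion_boxes:
        if all(iou(motion, det) < iou_threshold for det in frame.detected_boxes):
            return True
    return False


def extract_key_frames(frames: List[Frame], iou_threshold: float = 0.3) -> List[int]:
    """Return the indices of frames to send to a human annotator."""
    return [f.index for f in frames if is_candidate_key_frame(f, iou_threshold)]


frames = [
    # Motion and detection agree: nothing to annotate.
    Frame(0, motion_boxes=[(10, 10, 50, 50)], detected_boxes=[(12, 12, 48, 48)]),
    # Motion without any detection: possible false negative.
    Frame(1, motion_boxes=[(10, 10, 50, 50)], detected_boxes=[]),
    # Static, empty scene: nothing to annotate.
    Frame(2),
]
key_frame_indices = extract_key_frames(frames)
```

Under this reading, only the selected frames would go through interactive labeling before being merged into the training set for the next training cycle, which is where the reduction in manual annotation would come from.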

Pages: 14

