Object Detection-Based Video Retargeting With Spatial-Temporal Consistency

Cited by: 18

Authors:

Lee, Seung Joon [1]

Lee, Siyeong [2]

Cho, Sung In [3]

Kang, Suk-Ju [1]

Affiliations:

[1] Sogang Univ, Dept Elect Engn, Seoul 04107, South Korea

[2] NAVER LABS, Seongnam Si 13638, South Korea

[3] Dongguk Univ, Dept Multimedia Engn, Seoul 04620, South Korea

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2020, Vol. 30, No. 12

Keywords:

Object detection; Object tracking; Distortion; Indexes; Computational complexity; Image sequences; Optimization; Video retargeting; Convolutional neural network

DOI:

10.1109/TCSVT.2020.2981652

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification codes:

0808; 0809

Abstract:

This study proposes a video retargeting method using deep neural network-based object detection. First, the meaningful regions of the input video, denoted by the bounding boxes produced by object detection, are extracted. Here, the meaningful area is defined by considering the size and number of the bounding boxes of the detected objects. The bounding boxes of each frame are treated as regions of interest (RoIs). Second, a Siamese object tracking network is used to address the high computational complexity of the object detection network. The video is divided into scenes, and object detection is performed on the first frame of each scene to obtain the initial bounding box. Object tracking is then performed on each subsequent frame until a scene change is detected. Third, the image is resized in the horizontal direction to alter its aspect ratio, and the 1D RoIs of the image are obtained by projecting the bounding boxes in the vertical direction. The proposed method then computes a grid map from the 1D RoIs to calculate the new coordinates of each column of the image. Finally, the retargeted video is obtained by rearranging all retargeted frames. Comparative experiments against various benchmark methods show an average bidirectional similarity score of 1.92, which is higher than that of the conventional methods. The proposed method was stable and satisfied viewers without causing cognitive discomfort, unlike the conventional methods.
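
The third step of the abstract (projecting bounding boxes to 1D RoIs and deriving new column coordinates) can be illustrated with a minimal sketch. This is not the authors' grid-map formulation, which the abstract does not specify. It assumes a simple rule in which RoI columns keep unit width and the remaining columns absorb the whole width change uniformly. The names `project_boxes_to_1d` and `column_grid_map` and the box layout `(x_min, y_min, x_max, y_max)` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical box layout: (x_min, y_min, x_max, y_max), with x_max exclusive.
Box = Tuple[int, int, int, int]


def project_boxes_to_1d(boxes: List[Box], width: int) -> List[bool]:
    """Project boxes vertically: a column is RoI if any box covers it."""
    roi = [False] * width
    for x_min, _, x_max, _ in boxes:
        for x in range(max(0, x_min), min(width, x_max)):
            roi[x] = True
    return roi


def column_grid_map(roi: List[bool], target_width: int) -> List[float]:
    """Return the new left-edge x coordinate of every source column.

    Assumed rule (not taken from the paper): RoI columns keep unit width,
    and non-RoI columns are scaled uniformly to reach the target width.
    """
    width = len(roi)
    if width == 0:
        return []
    n_roi = sum(roi)
    n_bg = width - n_roi
    if n_bg == 0 or n_roi >= target_width:
        # Background cannot absorb the change: fall back to uniform scaling.
        roi_scale = bg_scale = target_width / width
    else:
        roi_scale = 1.0
        bg_scale = (target_width - n_roi) / n_bg
    coords: List[float] = []
    x = 0.0
    for is_roi in roi:
        coords.append(x)
        x += roi_scale if is_roi else bg_scale
    return coords


if __name__ == "__main__":
    roi_1d = project_boxes_to_1d([(2, 0, 4, 10)], width=6)
    print(column_grid_map(roi_1d, target_width=4))
```

For a 6-column frame with one box covering columns 2 and 3 and a target width of 4, the two RoI columns keep their width while the four background columns each shrink to half a column, giving left edges `[0.0, 0.5, 1.0, 2.0, 3.0, 3.5]`. A real implementation would also need to resample pixel data at these coordinates and smooth the map over time for temporal consistency.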


Pages: 4434-4439

Page count: 6

