Label Distribution Learning for Video Summarization

Cited by: 0
Authors
Liu Y. [1 ]
Tang S. [1 ]
Gao Y. [1 ]
Li Z. [1 ]
Li H. [2 ,3 ]
Affiliations
[1] College of Computer & Communication Engineering, China University of Petroleum, Qingdao
[2] Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing
[3] University of Chinese Academy of Sciences, Beijing
Keywords
Key frame; Label distribution learning model; Multi-label learning; Video summarization;
DOI
10.3724/SP.J.1089.2019.17281
Abstract
Supervised video summarization algorithms suffer from complicated model training. To address this problem, a new video summarization algorithm based on label distribution learning (LDL) is proposed. The algorithm uses non-parametric supervised learning to generate summaries; the main idea is to transfer summary structure from annotated videos to test videos of the same type via label transfer. First, convolutional neural network features and color features of the video are extracted; the two are combined and reduced in dimension to form a feature matrix. This matrix is then fed into the LDL model together with the label distributions of the training samples. Finally, key frames are selected according to the label distribution output by the model and composed into a video summary. Comparative experiments against other algorithms on standard benchmarks show that the summaries generated by this algorithm are highly consistent with human-created summaries and clearly superior to those of the other methods. © 2019, Beijing China Science Journal Publishing Co. Ltd. All rights reserved.
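The pipeline the abstract describes (per-frame features → transfer of label distributions from annotated training frames → top-k key-frame selection) can be sketched in a heavily simplified form. Everything here is hypothetical: the `summarize` function, the Gaussian similarity weighting, and the use of a 1-D per-frame importance score in place of a full label distribution are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def summarize(train_feats, train_label_dist, test_feats, k_frames=2, sigma=1.0):
    """Transfer per-frame importance scores from annotated training frames
    to test frames via similarity weighting, then pick the top-k frames.
    A minimal sketch of label-distribution transfer, not the paper's model."""
    # Pairwise squared distances between each test frame and each train frame
    d2 = ((test_feats[:, None, :] - train_feats[None, :, :]) ** 2).sum(-1)
    # Gaussian similarity weights, normalized so each test frame's weights sum to 1
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    w /= w.sum(axis=1, keepdims=True)
    # Transferred importance: similarity-weighted average of training labels
    test_dist = w @ train_label_dist
    # Select the k_frames highest-scoring test frames as the summary
    keyframes = np.argsort(test_dist)[::-1][:k_frames]
    return np.sort(keyframes)
```

In this toy setup, test frames whose features resemble highly-annotated training frames inherit high importance and are chosen as key frames; the real algorithm would use CNN-plus-color features after dimension reduction in place of the raw vectors here.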
Pages: 104-110
Page count: 6
References
27 in total
  • [1] Heilbron F.C., Escorcia V., Ghanem B., Et al., ActivityNet: a large-scale video benchmark for human activity understanding, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 961-970, (2015)
  • [2] Lu Y., Bai X., Shapiro L., Et al., Coherent parametric contours for interactive video object segmentation, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 642-650, (2016)
  • [3] Gong B.Q., Chao W.L., Grauman K., Et al., Diverse sequential subset selection for supervised video summarization, Proceedings of the 28th Annual Conference on Neural Information Processing Systems, pp. 2069-2077, (2014)
  • [4] Lee Y.J., Ghosh J., Grauman K., Discovering important people and objects for egocentric video summarization, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 1346-1353, (2012)
  • [5] Liu D., Hua G., Chen T., A hierarchical visual model for video object summarization, IEEE Transactions on Pattern Analysis and Machine Intelligence, 32, 12, pp. 2178-2190, (2010)
  • [6] Lu Z., Grauman K., Story-driven summarization for egocentric video, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 2714-2721, (2013)
  • [7] Nam J., Tewfik A.H., Event-driven video abstraction and visualization, Multimedia Tools and Applications, 16, 1-2, pp. 55-77, (2002)
  • [8] Furini M., Geraci F., Montangero M., Et al., STIMO: still and moving video storyboard for the web scenario, Multimedia Tools and Applications, 46, 1, pp. 47-69, (2010)
  • [9] Goldman D.B., Curless B., Salesin D., Et al., Schematic storyboarding for video visualization and editing, ACM Transactions on Graphics, pp. 862-871, (2006)
  • [10] Kang H.W., Matsushita Y., Tang X.O., Et al., Space-time video montage, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2, pp. 1331-1338, (2006)