A Bag-of-Importance Model With Locality-Constrained Coding Based Feature Learning for Video Summarization

Cited by: 62
Authors
Lu, Shiyang [1]
Wang, Zhiyong [1]
Mei, Tao [2]
Guan, Genliang [1]
Feng, David Dagan [1]
Affiliations
[1] Univ Sydney, Sch Informat Technol, Sydney, NSW 2006, Australia
[2] Microsoft Res, Beijing 100080, Peoples R China
Keywords
Locality-constrained linear coding; sparse coding; video summarization; FRAMEWORK; SELECTION;
DOI
10.1109/TMM.2014.2319778
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Video summarization helps users obtain quick comprehension of video content. Recently, some studies have utilized local features to represent each video frame and formulate video summarization as a coverage problem of local features. However, the importance of individual local features has not been exploited. In this paper, we propose a novel Bag-of-Importance (BoI) model for static video summarization by identifying the frames with important local features as keyframes, which is one of the first studies formulating video summarization at local feature level, instead of at global feature level. That is, by representing each frame with local features, a video is characterized with a bag of local features weighted with individual importance scores and the frames with more important local features are more representative, where the representativeness of each frame is the aggregation of the weighted importance of the local features contained in the frame. In addition, we propose to learn a transformation from a raw local feature to a more powerful sparse nonlinear representation for deriving the importance score of each local feature, rather than directly utilize the hand-crafted visual features like most of the existing approaches. Specifically, we first employ locality-constrained linear coding (LCC) to project each local feature into a sparse transformed space. LCC is able to take advantage of the manifold geometric structure of the high dimensional feature space and form the manifold of the low dimensional transformed space with the coordinates of a set of anchor points. Then we calculate the norm of each anchor point as the importance score of each local feature which is projected to the anchor point. Finally, the distribution of the importance scores of all the local features in a video is obtained as the BoI representation of the video. 
We further differentiate the importance of local features with a spatial weighting template by taking the perceptual difference among spatial regions of a frame into account. As a result, our proposed video summarization approach is able to exploit both the inter-frame and intra-frame properties of feature representations and identify keyframes capturing both the dominant content and discriminative details within a video. Experimental results on three video datasets across various genres demonstrate that the proposed approach clearly outperforms several state-of-the-art methods.
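The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the coding step uses the common k-nearest-anchor approximation of locality-constrained linear coding; the score of an anchor is taken to be the L2 norm of its coefficients over all features in the video; each feature inherits the score of the anchor with its largest coefficient; and the anchors are random rather than learned. The names `locality_constrained_codes`, `frame_scores`, and `spatial_w` are hypothetical.

```python
import numpy as np

def locality_constrained_codes(features, anchors, k=3, reg=1e-4):
    """Code each feature over its k nearest anchors (approximate solver).

    Returns an (n_features, n_anchors) matrix whose rows sum to one and
    have at most k non-zero entries.
    """
    n, m = features.shape[0], anchors.shape[0]
    codes = np.zeros((n, m))
    for i, x in enumerate(features):
        dists = np.linalg.norm(anchors - x, axis=1)
        nn = np.argsort(dists)[:k]              # indices of the k nearest anchors
        z = anchors[nn] - x                     # shift neighbours to the origin
        cov = z @ z.T                           # local k x k covariance
        cov += reg * max(np.trace(cov), 1e-12) * np.eye(k)  # regularise
        w = np.linalg.solve(cov, np.ones(k))
        codes[i, nn] = w / w.sum()              # enforce the sum-to-one constraint
    return codes

def frame_scores(codes, frame_ids, n_frames, spatial_w=None):
    """Aggregate per-feature importance into per-frame representativeness."""
    # Assumption: anchor importance = L2 norm of its coefficients over the video.
    anchor_importance = np.linalg.norm(codes, axis=0)
    # Assumption: a feature is "projected to" the anchor with its largest coefficient.
    assigned = np.argmax(np.abs(codes), axis=1)
    feat_importance = anchor_importance[assigned]
    if spatial_w is not None:                   # optional spatial weighting template
        feat_importance = feat_importance * spatial_w
    return np.bincount(frame_ids, weights=feat_importance, minlength=n_frames)

# Toy data: 5 frames with 8 local features each, 6 random anchors (illustrative only).
rng = np.random.default_rng(0)
features = rng.normal(size=(40, 8))
anchors = rng.normal(size=(6, 8))
frame_ids = np.repeat(np.arange(5), 8)

codes = locality_constrained_codes(features, anchors, k=3)
scores = frame_scores(codes, frame_ids, n_frames=5)
keyframes = np.argsort(scores)[::-1][:2]        # two highest-scoring frames
```

In this sketch a frame's score is simply the sum of the importance scores of its features, which matches the "aggregation" wording of the abstract only loosely; the paper itself should be consulted for the exact scoring and keyframe selection rules.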
Pages: 1497-1509
Page count: 13