Sparse Spatio-Temporal Representation With Adaptive Regularized Dictionary Learning for Low Bit-Rate Video Coding

Cited by: 23
Authors
Xiong, Hongkai [1 ]
Pan, Zhiming [1 ]
Ye, Xinwei [1 ]
Chen, Chang Wen [2 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Elect Engn, Shanghai 200240, Peoples R China
[2] SUNY Buffalo, Dept Comp Sci & Engn, Buffalo, NY 14260 USA
Funding
National Natural Science Foundation of China;
Keywords
Atom decomposition; dictionary learning; primitive patch; sparse representation; video coding; BLOCK MOTION COMPENSATION; IMAGE QUALITY ASSESSMENT; SUPERRESOLUTION; ALGORITHM;
DOI
10.1109/TCSVT.2012.2221271
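Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code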
0808 ; 0809 ;
Abstract
For promising vision-based video coding on low-quality data, this paper proposes a sparse spatio-temporal representation with adaptive regularized dictionary learning and develops a low bit-rate video coding scheme. In a reversed-complexity Wyner-Ziv coding manner, it selects a subset of key frames to code at original resolution, while the rest are down-sampled and reconstructed by a sparse spatio-temporal approximation using key frames as a training dataset. Since primitive patches (geometry) are of low dimensionality and can be well learned from the primitive patches across frames in a scale space, a video frame is divided into three layers: a primitive layer, a nonprimitive coarse layer, and a nonprimitive smooth layer. The multiscale differential feature representations are invertible to facilitate reconstruction with dictionary learning, and the target is formulated as an optimization problem by constructing a sparse representation of 2-D patches and 3-D volumes over adaptive regularized dictionaries, a set of 2-D subdictionary pairs trained from primitive patches, and a 3-D dictionary trained from nonprimitive volumes. Specifically, the nonprimitive layer is constructed as volumes in order to keep it consistent along the motion trajectory, which enables sparse representations over a learned 3-D spatio-temporal dictionary. Through hierarchical bidirectional motion estimation and adaptive overlapped block motion compensation, the 3-D low-frequency and high-frequency dictionary pair is designed by the K-SVD algorithm to update the atoms for optimal sparse representation and convergence. In reconstruction, the lost high-frequency information of the down-sampled frames can be synthesized from the sparse spatio-temporal representation over the adaptive regularized dictionaries. Extensive experiments validate the compression efficiency of the proposed scheme versus H.264/AVC in terms of both objective and subjective comparisons.
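The reconstruction step in the abstract (synthesizing lost high-frequency detail from a sparse code over a low-frequency/high-frequency dictionary pair) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the dictionaries are random stand-ins for K-SVD-trained ones, the sparse coder is a plain orthogonal matching pursuit, and the names `omp`, `D_low`, `D_high`, and the patch sizes are hypothetical.

```python
import numpy as np

def omp(D, y, k):
    """Greedy orthogonal matching pursuit: approximate y with at most k atoms of D."""
    residual = y.copy()
    support = []
    sol = np.zeros(0)
    coef = np.zeros(D.shape[1])
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx in support:
            break
        support.append(idx)
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
        if np.linalg.norm(residual) < 1e-10:
            break
    coef[support] = sol
    return coef

rng = np.random.default_rng(0)
n_low, n_high, n_atoms, sparsity = 64, 64, 128, 3

# Hypothetical coupled dictionaries: column j of D_low and column j of D_high
# describe the same atom in the low-frequency and high-frequency domains.
D_low = rng.standard_normal((n_low, n_atoms))
D_low /= np.linalg.norm(D_low, axis=0)  # unit-norm atoms for the correlation step
D_high = rng.standard_normal((n_high, n_atoms))

# Synthetic low-frequency feature vector of one patch from a down-sampled frame.
alpha_true = np.zeros(n_atoms)
alpha_true[rng.choice(n_atoms, size=sparsity, replace=False)] = rng.standard_normal(sparsity)
y_low = D_low @ alpha_true

# Sparse-code the low-frequency features, then reuse the same code
# with the high-frequency dictionary to synthesize the missing detail.
alpha = omp(D_low, y_low, sparsity)
high_est = D_high @ alpha
```

The key design point in this sketch is that a single sparse code is shared by both dictionaries, so detail that is absent from the down-sampled input is supplied by the high-frequency atoms paired with the selected low-frequency atoms.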
Pages: 710-728
Number of pages: 19
Related papers
50 in total
  • [1] STOL: Spatio-Temporal Online Dictionary Learning for low bit-rate video coding
    Tang, Xin
    Xiong, Hongkai
    [J]. 2013 DATA COMPRESSION CONFERENCE (DCC), 2013, : 522 - 522
  • [2] Low bit-rate video coding with spatio-temporal geometric transforms
    deFaria, SMM
    Ghanbari, M
    [J]. IEE PROCEEDINGS-VISION IMAGE AND SIGNAL PROCESSING, 1996, 143 (03): : 164 - 170
  • [3] Adaptive frame skipping based on spatio-temporal complexity for low bit-rate video coding
    Pan, F.
    Lin, Z. P.
    Lin, X.
    Rahardja, S.
    Juwono, W.
    Slamet, F.
    [J]. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2006, 17 (03) : 554 - 563
  • [4] Low bit-rate SNR scalable video coding based on overcomplete dictionary learning and sparse representation
    Irannejad, Maziar
    Mahdavi-Nasab, Homayoun
    [J]. MULTIDIMENSIONAL SYSTEMS AND SIGNAL PROCESSING, 2020, 31 (02) : 465 - 489
  • [5] Sparse Representation With Spatio-Temporal Online Dictionary Learning for Promising Video Coding
    Dai, Wenrui
    Shen, Yangmei
    Tang, Xin
    Zou, Junni
    Xiong, Hongkai
    Chen, Chang Wen
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 25 (10) : 4580 - 4595
  • [6] Low bit-rate SNR scalable video coding based on overcomplete dictionary learning and sparse representation
    Maziar Irannejad
    Homayoun Mahdavi-Nasab
    [J]. Multidimensional Systems and Signal Processing, 2020, 31 : 465 - 489
  • [7] Sparse Spatio-Temporal Representation with Adaptive Regularized Dictionaries for Super-Resolution Based Video Coding
    Pan, Zhiming
    Xiong, Hongkai
    [J]. 2012 DATA COMPRESSION CONFERENCE (DCC), 2012, : 139 - 148
  • [8] Efficient spatio-temporal segmentation for very low bit rate video coding
    Handcock, J
    Canagarajah, N
    Tellert, W
    Bull, D
    [J]. VISUAL COMMUNICATIONS AND IMAGE PROCESSING '98, PTS 1 AND 2, 1997, 3309 : 544 - 551
  • [9] Progressive Dictionary Learning With Hierarchical Predictive Structure for Low Bit-Rate Scalable Video Coding
    Dai, Wenrui
    Shen, Yangmei
    Xiong, Hongkai
    Jiang, Xiaoqian
    Zou, Junni
    Taubman, David
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2017, 26 (06) : 2972 - 2987
  • [10] Spatio-temporal segmentation of image sequences for object-oriented low bit-rate image coding
    Wu, L
    BenoisPineau, J
    Delagnes, P
    Barba, D
    [J]. SIGNAL PROCESSING-IMAGE COMMUNICATION, 1996, 8 (06) : 513 - 543