Steered Mixture-of-Experts for Light Field Images and Video: Representation and Coding

Cited by: 24
Authors
Verhack, Ruben [1, 2]
Sikora, Thomas [2]
Van Wallendael, Glenn [1]
Lambert, Peter [1]
Affiliations
[1] Univ Ghent, IDLab, IMEC, B-9052 Ghent, Belgium
[2] Tech Univ Berlin, Commun Syst Grp, D-10623 Berlin, Germany
Keywords
Kernel; Encoding; Cameras; Image coding; Solid modeling; Image reconstruction; Image resolution; Mixture of experts; Light fields; Mixture models; Sparse representation; Bayesian modeling; Quality assessment; Multiview
DOI
10.1109/TMM.2019.2932614
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Research in light field (LF) processing has increased considerably over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, these 2-D regular grids are less suited for high-dimensional data, such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about light rays at any angle arriving at a certain region. The global model thus consists of a set of kernels that define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application for 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparably to the state of the art at low-to-mid-range bitrates with respect to subjective visual quality of 4-D LF images. In the case of 5-D LF video, we observe superior decorrelation and coding performance, with coding gains of a factor of 4 in bitrate for the same quality. At least equally important is the fact that our method inherently provides functionality desired for LF rendering that is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) lightweight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution.
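The abstract describes a set of kernels that together define a continuous approximation that can be evaluated independently per pixel. A minimal sketch of that idea for the 2-D image case follows, assuming Gaussian gating and affine experts as in generic Gaussian mixture regression. This record does not give the paper's actual formulation, so the `Kernel` class, its fields, and `reconstruct` are hypothetical names chosen for illustration only.

```python
import math


class Kernel:
    """Hypothetical kernel: a steered 2-D Gaussian gate plus an affine expert."""

    def __init__(self, center, cov, weight, offset, slope):
        self.center = center  # (x, y) position of the kernel
        self.cov = cov        # 2x2 covariance ((a, b), (b, c)); b != 0 rotates ("steers") it
        self.weight = weight  # mixing prior of this kernel
        self.offset = offset  # expert value at the kernel center
        self.slope = slope    # (gx, gy) gradient of the affine expert

    def gate(self, x, y):
        """Unnormalized Gaussian gate value at position (x, y)."""
        (a, b), (_, c) = self.cov
        det = a * c - b * b
        dx, dy = x - self.center[0], y - self.center[1]
        # Mahalanobis distance via the closed-form inverse of a 2x2 matrix.
        maha = (c * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
        return self.weight * math.exp(-0.5 * maha) / (2.0 * math.pi * math.sqrt(det))

    def expert(self, x, y):
        """Affine (planar) intensity prediction of this kernel at (x, y)."""
        dx, dy = x - self.center[0], y - self.center[1]
        return self.offset + self.slope[0] * dx + self.slope[1] * dy


def reconstruct(kernels, x, y):
    """Gate-weighted average of the expert predictions at one position.

    Each call depends only on (x, y) and the kernel list, so positions can be
    evaluated in any order, in parallel, and at non-integer coordinates.
    """
    gates = [k.gate(x, y) for k in kernels]
    total = sum(gates)
    if total == 0.0:  # numerically far from every kernel
        return 0.0
    return sum(g * k.expert(x, y) for g, k in zip(gates, kernels)) / total


# Two illustrative kernels: a dark region on the left, a bright one on the right.
identity = ((1.0, 0.0), (0.0, 1.0))
kernels = [
    Kernel((2.0, 2.0), identity, 0.5, 0.2, (0.0, 0.0)),
    Kernel((6.0, 2.0), identity, 0.5, 0.8, (0.0, 0.0)),
]
midpoint = reconstruct(kernels, 4.0, 2.0)     # halfway between both kernels
subpixel = reconstruct(kernels, 3.25, 2.0)    # non-integer sampling position
```

Because the model is a function of continuous coordinates, sampling it on a finer grid than the original pixels is one way to picture the interpolation and super-resolution properties the abstract mentions; extending the coordinate vector beyond two dimensions would be needed for the 4-D and 5-D cases.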
Pages: 579-593
Page count: 15