Steered Mixture-of-Experts for Light Field Images and Video: Representation and Coding

Cited by: 24

Authors:

Verhack, Ruben [1,2]

Sikora, Thomas [2]

Van Wallendael, Glenn [1]

Lambert, Peter [1]

Affiliations:

[1] Univ Ghent, IDLab, IMEC, B-9052 Ghent, Belgium

[2] Tech Univ Berlin, Commun Syst Grp, D-10623 Berlin, Germany

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2020 / Vol. 22 / No. 03

Keywords:

Kernel; Encoding; Cameras; Image coding; Solid modeling; Image reconstruction; Image resolution; Mixture of experts; light fields; mixture models; sparse representation; Bayesian modeling; QUALITY ASSESSMENT; MULTIVIEW;

DOI:

10.1109/TMM.2019.2932614

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Research in light field (LF) processing has heavily increased over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, these 2-D regular grids are less suited for high-dimensional data, such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about light rays at any angle arriving at a certain region. The global model thus consists of a set of kernels which define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application for 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparably to the state of the art for low-to-mid range bitrates with respect to subjective visual quality of 4-D LF images. In the case of 5-D LF video, we observe superior decorrelation and coding performance with coding gains of a factor of 4x in bitrate for the same quality. At least equally important is the fact that our method inherently has desired functionality for LF rendering which is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) light-weight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution.

Pages: 579-593

Page count: 15
