Video Synthesis via Transform-Based Tensor Neural Network

Cited by: 7
Authors
Zhang, Yimeng [1, 2]
Liu, Xiao-Yang [1, 2]
Wu, Bo [3]
Walid, Anwar [4]
Affiliations
[1] Tensor & Deep Learning Lab, New York, NY USA
[2] Columbia Univ, New York, NY USA
[3] MIT IBM Watson AI Lab, Cambridge, MA 02142 USA
[4] Nokia Bell Labs, Murray Hill, NJ USA
Keywords
Video synthesis; transform-based tensor; tensor neural network; interpolation and prediction; deep unfolding; STABILIZATION; SHRINKAGE
DOI
10.1145/3394171.3413527
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video frame synthesis is an important task in computer vision and has drawn great interest across a wide range of applications. However, existing neural network methods do not explicitly impose tensor low-rankness on videos to capture spatiotemporal correlations in a high-dimensional space, while existing iterative algorithms require hand-crafted parameters and have relatively long running times. In this paper, we propose Transform-Based Tensor-Net, a novel multi-phase deep neural network that exploits the low-rank structure of video data in a learned transform domain and unfolds an Iterative Shrinkage-Thresholding Algorithm (ISTA) for tensor signal recovery. Our design is based on two observations: (i) both linear and nonlinear transforms can be implemented by a neural network layer, and (ii) the soft-thresholding operator corresponds to an activation function. Further, such an unfolding design achieves nearly real-time performance at the cost of training time and, as a byproduct, is interpretable. Experimental results on the KTH and UCF-101 datasets show that, compared with the state-of-the-art methods DVF and Super SloMo, the proposed scheme improves the Peak Signal-to-Noise Ratio (PSNR) of video interpolation and prediction by 4.13 dB and 4.26 dB, respectively.
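Observation (ii) in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's Transform-Based Tensor-Net: it shows only the classical vector form of ISTA for an l1-regularized least-squares problem, with no learned transform and no tensor low-rank structure. The function names (`relu`, `soft_threshold`, `soft_threshold_relu`, `ista`) and the fixed step size are assumptions chosen for illustration. The sketch shows that soft-thresholding can be written as a difference of two ReLU activations, which is the property that lets each ISTA iteration be read as a network layer.

```python
import numpy as np

def relu(x):
    # Standard rectified linear activation.
    return np.maximum(x, 0.0)

def soft_threshold(x, theta):
    # Soft-thresholding operator: sign(x) * max(|x| - theta, 0).
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

def soft_threshold_relu(x, theta):
    # The same operator expressed with two ReLU activations (theta >= 0).
    return relu(x - theta) - relu(-x - theta)

def ista(A, y, lam, num_iters=200):
    # Plain ISTA for min_x 0.5 * ||A x - y||^2 + lam * ||x||_1.
    # Illustrative only: an unfolded network would replace the fixed
    # matrices and threshold with parameters learned per phase.
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(num_iters):
        grad = A.T @ (A @ x - y)                          # gradient of the data term
        x = soft_threshold(x - step * grad, step * lam)   # proximal (shrinkage) step
    return x
```

In a deep-unfolding design, a fixed number of such iterations becomes the layers ("phases") of a network, and the quantities that are hand-set here are trained from data instead.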
Pages: 2454-2462
Number of pages: 9