Image-to-Video Translation Using a VAE-GAN with Refinement Network

Authors
Wang, Shengli [1]
Xieshi, Mulin [2]
Zhou, Zhangpeng [1]
Zhang, Xiang [2]
Liu, Xujie [2]
Tang, Zeyi [2]
Xiahou, Jianbing [3]
Lin, Pingyuan [3]
Xu, Xuexin [3]
Dai, Yuxing [3]
Affiliations
[1] Maintenance Co State Grid Power Co Gansu Prov, Lanzhou 730000, Gansu, Peoples R China
[2] State Grid Infotelecom Great Power Sci & Technol, Fuzhou 350000, Peoples R China
[3] Xiamen Univ, Sch Informat, Xiamen 361005, Peoples R China
Keywords
Video generation; Variational autoencoder; Generative adversarial network; Refinement network
DOI
10.1007/978-3-031-13870-6_42
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
With the development of deep learning, a variety of image processing techniques have emerged in computer vision in recent years and perform well in many application scenarios. Unlike video prediction, which uses multiple consecutive frames before and after a gap to predict the missing intermediate frames, the image-to-video generation task proposed in this paper does not require multiple consecutive frames. Instead, it generates content in a specified direction from a single first-frame image combined with an embedding vector of motion features. To address problems in generated videos such as incoherence, frame loss, and blurring, this paper introduces a new network architecture. For multiple image-to-video translation tasks, we propose VAE-RGAN, a VAE-GAN network extended with a refinement network. The refinement network, together with a new identity matching loss and a connected feature matching loss, mitigates the respective shortcomings of the VAE and the GAN and enhances the visual quality of the generated videos. We conduct a wide range of qualitative and quantitative experiments on the Weizmann dataset. From this empirical study we draw the following conclusions: (1) compared with state-of-the-art approaches, our approach (VAE-RGAN) exhibits significant improvements in generative capability; (2) the designed VAE-RGAN structure achieves better results, and the refinement network significantly reduces blurring.
Pages: 494-505
Number of pages: 12