Image-to-Video Translation Using a VAE-GAN with Refinement Network

Citations: 0
Authors
Wang, Shengli [1]
Xieshi, Mulin [2]
Zhou, Zhangpeng [1]
Zhang, Xiang [2]
Liu, Xujie [2]
Tang, Zeyi [2]
Xiahou, Jianbing [3]
Lin, Pingyuan [3]
Xu, Xuexin [3]
Dai, Yuxing [3]
Affiliations
[1] Maintenance Co State Grid Power Co Gansu Prov, Lanzhou 730000, Gansu, Peoples R China
[2] State Grid Infotelecom Great Power Sci & Technol, Fuzhou 350000, Peoples R China
[3] Xiamen Univ, Sch Informat, Xiamen 361005, Peoples R China
Keywords
Video generation; Variational autoencoder; Generative adversarial network; Refinement network
DOI
10.1007/978-3-031-13870-6_42
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
With the development of deep learning, a range of image processing techniques has emerged in computer vision in recent years, performing well across many application scenarios. Video prediction tasks typically use multiple consecutive frames before and after a gap to predict the missing frames in between. The image-to-video generation task proposed in this paper does not require multiple consecutive frames; instead, it generates directed content from a single first-frame image together with an embedding vector of motion features. To address problems in generated video such as incoherence, frame loss, and blurring, this paper modifies the network architecture. For multiple image-to-video translation tasks, we propose VAE-RGAN, a VAE-GAN network extended with a refinement network. We add the refinement network and use a new identity matching loss and a connected feature matching loss to address the respective shortcomings of the VAE and the GAN and to enhance the visual quality of the generated videos. We conduct a wide range of qualitative and quantitative experiments on the Weizmann datasets. We draw the following conclusions from this empirical study: (1) compared with state-of-the-art approaches, our approach (VAE-RGAN) exhibits significant improvements in generative capability; (2) experiments show that the VAE-RGAN structure achieves better results and that the refinement network significantly reduces blurring.
Pages: 494-505
Number of pages: 12