Image-to-Video Translation Using a VAE-GAN with Refinement Network

Authors
Wang, Shengli [1]
Xieshi, Mulin [2]
Zhou, Zhangpeng [1]
Zhang, Xiang [2]
Liu, Xujie [2]
Tang, Zeyi [2]
Xiahou, Jianbing [3]
Lin, Pingyuan [3]
Xu, Xuexin [3]
Dai, Yuxing [3]
Affiliations
[1] Maintenance Co State Grid Power Co Gansu Prov, Lanzhou 730000, Gansu, Peoples R China
[2] State Grid Infotelecom Great Power Sci & Technol, Fuzhou 350000, Peoples R China
[3] Xiamen Univ, Sch Informat, Xiamen 361005, Peoples R China
Keywords
Video generation; Variational autoencoder; Generative Adversarial Network; Refinement network
DOI
10.1007/978-3-031-13870-6_42
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
With the development of deep learning, many image processing techniques have emerged in computer vision in recent years and perform well in a variety of application scenarios. Video prediction tasks typically take multiple consecutive frames before and after a gap as input and predict the missing frames in between. In contrast, the image-to-video generation task proposed in this paper does not require multiple consecutive frames: it generates directional content from the first frame image together with an embedding vector of motion features. To address problems in generated videos such as incoherence, frame loss, and blurring, this paper modifies the network architecture. For multiple image-to-video translation tasks, we propose a VAE-RGAN network, which adds a refinement network and uses a new identity matching loss and a connected feature matching loss to offset the respective shortcomings of VAE and GAN and to enhance the visual quality of the generated videos. We conduct a wide range of qualitative and quantitative experiments on the Weizmann datasets. We draw the following conclusions from this empirical study: (1) compared with state-of-the-art approaches, our approach (VAE-RGAN) exhibits significant improvements in generative capability; (2) experiments show that the designed VAE-RGAN structure achieves better results and that the refinement network significantly reduces blurring.
Pages: 494-505
Page count: 12