Music Conditioned Generation for Human-Centric Video

Cited by: 0
Authors
Zhao, Zimeng [1]
Zuo, Binghui [1]
Wang, Yangang [1]
Affiliations
[1] Southeast Univ, Sch Automat, Key Lab Measurement & Control Complex Syst Engn, Minist Educ, Nanjing 210096, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multiple signal classification; Generative adversarial networks; Correlation; Visualization; Training; Task analysis; Feature extraction; Video generation; signal processing; cross-modal learning; human-centric;
DOI
10.1109/LSP.2024.3358978
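Chinese Library Classification (CLC) Number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline Classification Code
0808; 0809;
Abstract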
Music and human-centric video are two fundamental signals across languages. Correlation analysis between the two is currently used in choreography and film accompaniment. This letter explores this correlation in a new task: human-centric video generation from a start-end image pair and transitional music. Existing human-centric generation methods are not competent for this task because they require frame-wise pose as input or have difficulty handling long-duration videos. Our key idea is to build a temporal generation framework dominated by DDPM and assisted by VAE and GAN. It reduces the computational cost of music-image diffusion by utilizing the latent space compactness of VAE and the image translation efficiency of GAN. To produce videos with both long duration and high quality, our framework first generates small-scale keyframes and then generates high-resolution videos. To strengthen the frame-wise consistency of the human body, a frame-aligned correspondence map is adopted as an intermediate supervision. Extensive experiments compared with the SOTA method have demonstrated the rationality and effectiveness of this signal generation framework.
Pages: 506-510
Page count: 5