21 items in total
- [1] MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration. Computer Vision, ECCV 2022, Pt VIII, 2022, 13668: 431-449
- [2] Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation. Thirty-Eighth AAAI Conference on Artificial Intelligence, Vol 38 No 7, 2024: 6639-6647
- [3] Text2Video: Automatic Video Generation Based on Text Scripts. Proceedings of the 29th ACM International Conference on Multimedia, MM 2021, 2021: 2753-2755
- [4] Automated generation of news content hierarchy by integrating audio, video, and text information. ICASSP '99: 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing, Proceedings Vols I-VI, 1999: 3025-3028
- [6] Learning Universal Policies via Text-Guided Video Generation. Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023
- [7] Text-to-Audio Generation using Instruction-Guided Latent Diffusion Model. Proceedings of the 31st ACM International Conference on Multimedia, MM 2023, 2023: 3590-3598
- [8] Text2Performer: Text-Driven Human Video Generation. 2023 IEEE/CVF International Conference on Computer Vision (ICCV 2023), 2023: 22690-22700
- [9] Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023: 10681-10692
- [10] ED-T2V: An Efficient Training Framework for Diffusion-based Text-to-Video Generation. 2023 International Joint Conference on Neural Networks, IJCNN, 2023