50 results in total
- [1] Text-to-Audio Generation using Instruction-Guided Latent Diffusion Model. Proceedings of the 31st ACM International Conference on Multimedia, MM 2023, 2023, pp. 3590-3598
- [2] LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation. Interspeech 2024, 2024, pp. 4813-4817
- [3] Retrieval-Augmented Text-to-Audio Generation. 2024 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2024, 2024, pp. 581-585
- [5] Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024, 2024, pp. 7151-7161
- [7] Generation or Replication: Auscultating Audio Latent Diffusion Models. 2024 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2024, 2024, pp. 1156-1160
- [8] Text-to-Audio Grounding: Building Correspondence Between Captions and Sound Events. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2021), 2021, pp. 606-610
- [9] Speakmytext: A Platform To Support Crowd-Sourced Text-To-Audio Translations. Proceedings of the First African Conference for Human Computer Interaction (AfriCHI'16), 2016, pp. 160-164
- [10] BATON: Aligning Text-to-Audio Model Using Human Preference Feedback. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI 2024, 2024, pp. 4542-4550