50 results in total
- [21] Sound of Vision: Audio Generation from Visual Text Embedding through Training Domain Discriminator. INTERSPEECH 2024, 2024: 3305-3309
- [23] Contrastive Latent Space Reconstruction Learning for Audio-Text Retrieval. 2023 IEEE 35TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, ICTAI, 2023: 913-917
- [24] Diffusion-Based Audio Inpainting. JOURNAL OF THE AUDIO ENGINEERING SOCIETY, 2024, 72(03): 100-113
- [25] INVESTIGATING POOLING STRATEGIES AND LOSS FUNCTIONS FOR WEAKLY-SUPERVISED TEXT-TO-AUDIO GROUNDING VIA CONTRASTIVE LEARNING. 2023 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW, 2023
- [26] Open-Vocabulary Keyword Spotting With Audio And Text Embeddings. INTERSPEECH 2019, 2019: 3362-3366
- [28] LSB Based Audio Steganography Based On Text Compression. INTERNATIONAL CONFERENCE ON COMMUNICATION TECHNOLOGY AND SYSTEM DESIGN 2011, 2012, 30: 703-710
- [29] Automatic generation of audio content for open learning resources. JOURNAL OF INTERACTIVE MEDIA IN EDUCATION, 2009, (01)
- [30] TAVT: Towards Transferable Audio-Visual Text Generation. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023: 14983-14999