Latent Diffusion for Language Generation

Cited by: 0
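Authors
Lovelace, Justin [1]
Kishore, Varsha [1]
Wan, Chao [1]
Shekhtman, Eliot [1]
Weinberger, Kilian Q. [1]
Affiliation
[1] Cornell Univ, Ithaca, NY 14853 USA
Funding
National Science Foundation (USA)
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract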
Diffusion models have achieved great success in modeling continuous data modalities such as images, audio, and video, but have seen limited use in discrete domains such as language. Recent attempts to adapt diffusion to language have presented diffusion as an alternative to existing pretrained language models. We view diffusion and existing language models as complementary. We demonstrate that encoder-decoder language models can be utilized to efficiently learn high-quality language autoencoders. We then demonstrate that continuous diffusion models can be learned in the latent space of the language autoencoder, enabling us to sample continuous latent representations that can be decoded into natural language with the pretrained decoder. We validate the effectiveness of our approach for unconditional, class-conditional, and sequence-to-sequence language generation. We demonstrate across multiple diverse data sets that our latent language diffusion models are significantly more effective than previous diffusion language models. Our code is available at https://github.com/justinlovelace/latent-diffusion-for-language.
Pages: 28