Latent Diffusion for Language Generation

Cited by: 0
Authors
Lovelace, Justin [1]
Kishore, Varsha [1]
Wan, Chao [1]
Shekhtman, Eliot [1]
Weinberger, Kilian Q. [1]
Affiliations
[1] Cornell Univ, Ithaca, NY 14853 USA
Funding
U.S. National Science Foundation
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Diffusion models have achieved great success in modeling continuous data modalities such as images, audio, and video, but have seen limited use in discrete domains such as language. Recent attempts to adapt diffusion to language have presented diffusion as an alternative to existing pretrained language models. We view diffusion and existing language models as complementary. We demonstrate that encoder-decoder language models can be utilized to efficiently learn high-quality language autoencoders. We then demonstrate that continuous diffusion models can be learned in the latent space of the language autoencoder, enabling us to sample continuous latent representations that can be decoded into natural language with the pretrained decoder. We validate the effectiveness of our approach for unconditional, class-conditional, and sequence-to-sequence language generation. We demonstrate across multiple diverse data sets that our latent language diffusion models are significantly more effective than previous diffusion language models. Our code is available at https://github.com/justinlovelace/latent-diffusion-for-language.
Pages: 28
Related papers
50 items in total
  • [31] LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models
    Yaohui Wang
    Xinyuan Chen
    Xin Ma
    Shangchen Zhou
    Ziqi Huang
    Yi Wang
    Ceyuan Yang
    Yinan He
    Jiashuo Yu
    Peiqing Yang
    Yuwei Guo
    Tianxing Wu
    Chenyang Si
    Yuming Jiang
    Cunjian Chen
    Chen Change Loy
    Bo Dai
    Dahua Lin
    Yu Qiao
    Ziwei Liu
    International Journal of Computer Vision, 2025, 133 (5) : 3059 - 3078
  • [32] Controllable Human Trajectory Generation Using Profile-Guided Latent Diffusion
    Song, Yiwen
    Ding, Jingtao
    Yuan, Jian
    Liao, Qingmin
    Li, Yong
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2025, 19 (01)
  • [33] Text-Guided Molecule Generation with Diffusion Language Model
    Gong, Haisong
    Liu, Qiang
    Wu, Shu
    Wang, Liang
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 1, 2024 : 109 - 117
  • [34] Dance2Music-Diffusion: leveraging latent diffusion models for music generation from dance videos
    Zhang, Chaoyang
    Hua, Yan
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2024, 2024 (01)
  • [35] Blended Latent Diffusion
    Avrahami, Omri
    Fried, Ohad
    Lischinski, Dani
    ACM TRANSACTIONS ON GRAPHICS, 2023, 42 (04)
  • [36] Binary Latent Diffusion
    Wang, Ze
    Wang, Jiang
    Liu, Zicheng
    Qiu, Qiang
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023 : 22576 - 22585
  • [37] LATENT LANGUAGE AND SCHIZOPHRENIA
    SHAVE, D
    AMERICAN JOURNAL OF PSYCHOTHERAPY, 1965, 19 (01) : 29 - 39
  • [38] Utilizing Latent Diffusion Model to Accelerate Sampling Speed and Enhance Text Generation Quality
    Li, Chenyang
    Zhang, Long
    Zheng, Qiusheng
    ELECTRONICS, 2024, 13 (06)
  • [39] Realistic Tumor generation using 3D conditional latent diffusion model
    Hu, Rui
    Yoon, Siyeop
    Wu, Dufan
    Tivnan, Matthew
    Chen, Zhennong
    Wang, Yuang
    Luo, Jie
    Cui, Jianan
    Li, Quanzheng
    Liu, Huafeng
    Guo, Ning
    JOURNAL OF NUCLEAR MEDICINE, 2024, 65
  • [40] Text-to-Audio Generation using Instruction-Guided Latent Diffusion Model
    Ghosal, Deepanway
    Majumder, Navonil
    Mehrish, Ambuj
    Poria, Soujanya
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023 : 3590 - 3598