Latent Diffusion for Language Generation

Cited by: 0
Authors
Lovelace, Justin [1]
Kishore, Varsha [1]
Wan, Chao [1]
Shekhtman, Eliot [1]
Weinberger, Kilian Q. [1]
Affiliations
[1] Cornell Univ, Ithaca, NY 14853 USA
Funding
U.S. National Science Foundation
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Diffusion models have achieved great success in modeling continuous data modalities such as images, audio, and video, but have seen limited use in discrete domains such as language. Recent attempts to adapt diffusion to language have presented diffusion as an alternative to existing pretrained language models. We view diffusion and existing language models as complementary. We demonstrate that encoder-decoder language models can be utilized to efficiently learn high-quality language autoencoders. We then demonstrate that continuous diffusion models can be learned in the latent space of the language autoencoder, enabling us to sample continuous latent representations that can be decoded into natural language with the pretrained decoder. We validate the effectiveness of our approach for unconditional, class-conditional, and sequence-to-sequence language generation. We demonstrate across multiple diverse data sets that our latent language diffusion models are significantly more effective than previous diffusion language models. Our code is available at https://github.com/justinlovelace/latent-diffusion-for-language.
Pages: 28