Latent generative landscapes as maps of functional diversity in protein sequence space

Authors
Cheyenne Ziegler
Jonathan Martin
Claude Sinner
Faruck Morcos
Affiliations
[1] University of Texas at Dallas, Department of Biological Sciences
[2] University of Texas at Dallas, Department of Bioengineering
[3] University of Texas at Dallas, Center for Systems Biology
Abstract
Variational autoencoders are unsupervised learning models with generative capabilities. When applied to protein data, they classify sequences by phylogeny and generate de novo sequences that preserve statistical properties of protein composition. While previous studies focus on clustering and generative features, here we evaluate the underlying latent manifold in which sequence information is embedded. To investigate properties of the latent manifold, we use direct coupling analysis and a Potts Hamiltonian model to construct a latent generative landscape. We show how this landscape captures phylogenetic groupings and the functional and fitness properties of several systems, including globins, β-lactamases, ion channels, and transcription factors. We provide support for how the landscape helps explain the effects of sequence variability observed in experimental data and provides insights into directed and natural protein evolution. We propose that combining the generative properties and functional predictive power of variational autoencoders with coevolutionary analysis could benefit applications in protein engineering and design.
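The abstract does not spell out how the Potts Hamiltonian is evaluated, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a common sign convention in the direct coupling analysis literature, H(s) = -sum_i h_i(s_i) - sum_{i<j} J_ij(s_i, s_j), in which lower values indicate sequences more compatible with the model. The function name `potts_energy`, the array layouts of `h` and `J`, and the toy parameter values are all hypothetical and chosen for illustration.

```python
import numpy as np

def potts_energy(seq, h, J):
    """Potts Hamiltonian of one integer-encoded sequence (illustrative sketch).

    seq : length-L sequence of integer states in [0, q)
    h   : array of shape (L, q), local fields (assumed layout)
    J   : array of shape (L, L, q, q), pairwise couplings (assumed layout)
    """
    seq = np.asarray(seq)
    pos = np.arange(seq.shape[0])
    # Sum of local fields h[i, s_i] over all positions.
    field_term = h[pos, seq].sum()
    # Matrix of J[i, j, s_i, s_j] for every pair (i, j).
    pair = J[pos[:, None], pos[None, :], seq[:, None], seq[None, :]]
    # Keep only pairs with i < j so each coupling is counted once.
    coupling_term = np.triu(pair, k=1).sum()
    return -(field_term + coupling_term)

# Toy parameters (hypothetical): 3 positions, 2 states per position.
L, q = 3, 2
h = np.zeros((L, q))
h[0, 1] = 1.0
J = np.zeros((L, L, q, q))
J[0, 1, 1, 1] = 0.5

energy = potts_energy([1, 1, 0], h, J)  # -(1.0 + 0.5) = -1.5
```

Under these assumptions, a landscape over a two-dimensional latent space could be built by decoding each point of a latent grid to a sequence and recording its energy at that point. The decoder and the fitted `h` and `J` would have to come from a trained variational autoencoder and a direct coupling analysis fit, neither of which is shown here.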