Latent generative landscapes as maps of functional diversity in protein sequence space

Authors
Cheyenne Ziegler
Jonathan Martin
Claude Sinner
Faruck Morcos
Affiliations
[1] University of Texas at Dallas,Department of Biological Sciences
[2] University of Texas at Dallas,Department of Bioengineering
[3] University of Texas at Dallas,Center for Systems Biology
Abstract
Variational autoencoders are unsupervised learning models with generative capabilities. When applied to protein data, they classify sequences by phylogeny and generate de novo sequences that preserve statistical properties of protein composition. While previous studies focus on clustering and generative features, here we evaluate the underlying latent manifold in which sequence information is embedded. To investigate properties of the latent manifold, we use direct coupling analysis and a Potts Hamiltonian model to construct a latent generative landscape. We show how this landscape captures phylogenetic groupings and the functional and fitness properties of several systems, including globins, β-lactamases, ion channels, and transcription factors. We provide support for how the landscape helps explain the effects of sequence variability observed in experimental data and provides insights into directed and natural protein evolution. We propose that combining the generative properties and functional predictive power of variational autoencoders with coevolutionary analysis could be beneficial in applications for protein engineering and design.
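The abstract names a Potts Hamiltonian as the scoring model behind the landscape. As a rough illustration only, the sketch below evaluates the standard Potts form H(s) = -Σ_i h_i(s_i) - Σ_{i<j} J_ij(s_i, s_j) for an integer-encoded sequence. The function name `potts_energy`, the data layout, and all parameter values are hypothetical toy choices, not the authors' code or inferred couplings; how latent coordinates are mapped to energies is not described in this abstract.

```python
from itertools import combinations


def potts_energy(seq, h, J):
    """Return H(s) = -sum_i h[i][s_i] - sum_{i<j} J[(i, j)][s_i][s_j].

    seq : list of integer-encoded residues, one per position
    h   : h[i][a] is the local field for residue a at position i
    J   : J[(i, j)][a][b] is the coupling between residue a at i and b at j (i < j)
    """
    fields = sum(h[i][a] for i, a in enumerate(seq))
    couplings = sum(
        J[(i, j)][seq[i]][seq[j]]
        for i, j in combinations(range(len(seq)), 2)
    )
    return -(fields + couplings)


# Toy parameters (assumed values) for a length-3 sequence over a 2-letter alphabet.
# In direct coupling analysis these would be inferred from a sequence alignment.
h = [[0.5, -0.5], [0.0, 1.0], [0.25, 0.0]]
J = {
    (0, 1): [[1.0, 0.0], [0.0, 1.0]],
    (0, 2): [[0.0, 0.5], [0.5, 0.0]],
    (1, 2): [[0.0, 0.0], [0.0, 2.0]],
}

energy_a = potts_energy([0, 0, 0], h, J)
energy_b = potts_energy([1, 1, 1], h, J)
print(energy_a, energy_b)
```

Under this sign convention, lower energies correspond to sequences that agree better with the fields and couplings, so a grid of energies over decoded latent points would form one plausible kind of landscape.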