SuPreMo: a computational tool for streamlining in silico perturbation using sequence-based predictive models

Cited by: 1
Authors
Gjoni, Ketrin [1, 2]
Pollard, Katherine S. [1, 2, 3]
Affiliations
[1] Gladstone Inst, Inst Data Sci & Biotechnol, 1650 Owens St, San Francisco, CA 94158 USA
[2] Univ Calif San Francisco, Dept Epidemiol & Biostat, San Francisco, CA 94158 USA
[3] Chan Zuckerberg Biohub, San Francisco, CA 94158 USA
Funding
National Institutes of Health (USA);
Keywords
VARIANTS; GENOME;
DOI
10.1093/bioinformatics/btae340
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
The increasing development of sequence-based machine learning models has raised the demand for manipulating sequences for this application. However, existing approaches to edit and evaluate genome sequences using models have limitations, such as incompatibility with structural variants, challenges in identifying responsible sequence perturbations, and the need for VCF file inputs and phased data. To address these bottlenecks, we present Sequence Mutator for Predictive Models (SuPreMo), a scalable and comprehensive tool for performing and supporting in silico mutagenesis experiments. We then demonstrate how pairs of reference and perturbed sequences can be used with machine learning models to prioritize pathogenic variants or discover new functional sequences.
Pages: 5
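
The abstract's idea of pairing a reference sequence with a perturbed sequence and comparing model outputs can be illustrated with a minimal sketch. This is not SuPreMo's API or implementation: the function names `make_sequence_pair`, `gc_content`, and `perturbation_effect` are hypothetical, the toy genome string is made up, and GC content is only a stand-in for a real predictive model. Many sequence models also expect fixed-length input, which this sketch does not handle.

```python
from typing import Callable, Tuple


def make_sequence_pair(
    reference: str, pos: int, ref: str, alt: str, flank: int
) -> Tuple[str, str]:
    """Return (reference_window, perturbed_window) around one variant.

    `pos` is the 0-based index of the first REF base in `reference`
    (an assumption of this sketch; VCF files use 1-based positions).
    """
    if reference[pos:pos + len(ref)] != ref:
        raise ValueError("REF allele does not match the reference sequence")
    start = max(0, pos - flank)
    end = min(len(reference), pos + len(ref) + flank)
    ref_window = reference[start:end]
    # Substitute ALT for REF while keeping the same flanking bases.
    alt_window = reference[start:pos] + alt + reference[pos + len(ref):end]
    return ref_window, alt_window


def gc_content(seq: str) -> float:
    """Toy stand-in for a predictive model: fraction of G/C bases."""
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


def perturbation_effect(
    model: Callable[[str], float], ref_seq: str, alt_seq: str
) -> float:
    """Score a variant as the model output on ALT minus that on REF."""
    return model(alt_seq) - model(ref_seq)


# Made-up example: a 4-bp REF allele replaced by a single base.
genome = "ACGTACGTAATTACGT"
ref_seq, alt_seq = make_sequence_pair(genome, pos=8, ref="AATT", alt="G", flank=4)
effect = perturbation_effect(gc_content, ref_seq, alt_seq)
print(ref_seq, alt_seq, round(effect, 4))
```

Because the model is passed in as a plain callable, the same pairing step could in principle feed any scoring function; ranking variants by the absolute value of `effect` is one simple way to prioritize them under these assumptions.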