SVFX: a machine learning framework to quantify the pathogenicity of structural variants

Cited by: 21
Authors
Kumar, Sushant [1, 2]
Harmanci, Arif [3]
Vytheeswaran, Jagath [4]
Gerstein, Mark B. [1, 2, 5]
Affiliations
[1] Yale Univ, Program Computat Biol & Bioinformat, New Haven, CT 06520 USA
[2] Yale Univ, Dept Mol Biophys & Biochem, POB 6666, New Haven, CT 06520 USA
[3] Univ Texas Hlth Sci Ctr Houston, Sch Biomed Informat, Ctr Precis Hlth, Houston, TX 77030 USA
[4] CALTECH, Dept Comp & Math Sci, Pasadena, CA 91125 USA
[5] Yale Univ, Dept Comp Sci, 260-266 Whitney Ave,POB 208114, New Haven, CT 06520 USA
Funding
National Institutes of Health (USA);
Keywords
IMPACT; SETD3; MUTATIONS;
DOI
10.1186/s13059-020-02178-x
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
There is a lack of approaches for identifying pathogenic genomic structural variants (SVs) although they play a crucial role in many diseases. We present a mechanism-agnostic machine learning-based workflow, called SVFX, to assign pathogenicity scores to somatic and germline SVs. In particular, we generate somatic and germline training models, which include genomic, epigenomic, and conservation-based features, for SV call sets in diseased and healthy individuals. We then apply SVFX to SVs in cancer and other diseases; SVFX achieves high accuracy in identifying pathogenic SVs. Predicted pathogenic SVs in cancer cohorts are enriched among known cancer genes and many cancer-related pathways.
Pages: 21