SVFX: a machine learning framework to quantify the pathogenicity of structural variants

Cited by: 21
Authors
Kumar, Sushant [1, 2]
Harmanci, Arif [3]
Vytheeswaran, Jagath [4]
Gerstein, Mark B. [1, 2, 5]
Affiliations
[1] Yale Univ, Program Computat Biol & Bioinformat, New Haven, CT 06520 USA
[2] Yale Univ, Dept Mol Biophys & Biochem, POB 6666, New Haven, CT 06520 USA
[3] Univ Texas Hlth Sci Ctr Houston, Sch Biomed Informat, Ctr Precis Hlth, Houston, TX 77030 USA
[4] CALTECH, Dept Comp & Math Sci, Pasadena, CA 91125 USA
[5] Yale Univ, Dept Comp Sci, 260-266 Whitney Ave,POB 208114, New Haven, CT 06520 USA
Funding
National Institutes of Health (USA);
Keywords
IMPACT; SETD3; MUTATIONS;
DOI
10.1186/s13059-020-02178-x
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
There is a lack of approaches for identifying pathogenic genomic structural variants (SVs) although they play a crucial role in many diseases. We present a mechanism-agnostic machine learning-based workflow, called SVFX, to assign pathogenicity scores to somatic and germline SVs. In particular, we generate somatic and germline training models, which include genomic, epigenomic, and conservation-based features, for SV call sets in diseased and healthy individuals. We then apply SVFX to SVs in cancer and other diseases; SVFX achieves high accuracy in identifying pathogenic SVs. Predicted pathogenic SVs in cancer cohorts are enriched among known cancer genes and many cancer-related pathways.
Pages: 21