NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer

Cited by: 31
Authors
Anzar, Irantzu [1 ]
Sverchkova, Angelina [1 ]
Stratford, Richard [1 ]
Clancy, Trevor [1 ]
Affiliations
[1] OncoImmunity AS, Oslo Canc Cluster, Ullernchausseen 64-66, N-0379 Oslo, Norway
Keywords
Somatic variant detection; Machine learning; Cancer genomics; Precision medicine; POINT MUTATIONS; IDENTIFICATION; ALGORITHMS; DISCOVERY; VARIANTS; PIPELINE;
DOI
10.1186/s12920-019-0508-5
Chinese Library Classification (CLC) number
Q3 [Genetics];
Discipline classification codes
071007 ; 090102 ;
Abstract
Background: The accurate screening of tumor genomic landscapes for somatic mutations using high-throughput sequencing is a crucial step in precise clinical diagnosis and targeted therapy. However, the complex inherent features of cancer tissue, especially intra-tumor genetic heterogeneity, coupled with sequencing and alignment artifacts, make somatic variant calling a challenging task. Current variant filtering strategies, such as rule-based filtering and consensus voting across different algorithms, have helped to increase specificity, although this comes at the cost of sensitivity.
Methods: In light of this, we developed the NeoMutate framework, which incorporates 7 supervised machine learning (ML) algorithms to exploit the strengths of multiple variant callers, using a non-redundant set of biological and sequence features. We benchmarked NeoMutate by simulating more than 10,000 bona fide cancer-related mutations into three well-characterized Genome in a Bottle (GIAB) reference samples.
Results: A robust and exhaustive evaluation of NeoMutate's performance, based on 5-fold cross-validation experiments and 3 independent tests, demonstrated substantially improved variant detection accuracy compared with any of its individual composite variant callers and with consensus calling of multiple tools.
Conclusions: We show here that integrating multiple tools in an ensemble ML layer optimizes somatic variant detection rates, leading to a potentially improved variant selection framework for the diagnosis and treatment of cancer.
Pages: 14
Journal
BMC Medical Genomics, 12