Genetic variant pathogenicity prediction trained using disease-specific clinical sequencing data sets

Cited by: 15
Authors
Evans, Perry [1]
Wu, Chao [2]
Lindy, Amanda [3]
McKnight, Dianalee A. [3]
Lebo, Matthew [4, 5]
Sarmady, Mahdi [2, 6]
Abou Tayoun, Ahmad N. [2, 6, 7]
Affiliations
[1] Childrens Hosp Philadelphia, Dept Biomed & Hlth Informat, Philadelphia, PA 19104 USA
[2] Childrens Hosp Philadelphia, Div Genom Diagnost, Philadelphia, PA 19104 USA
[3] GeneDx, Gaithersburg, MD 20877 USA
[4] Partners HealthCare Personalized Med, Lab Mol Med, Cambridge, MA 02139 USA
[5] Harvard Med Sch, Brigham & Womens Hosp, Dept Pathol, Boston, MA 02115 USA
[6] Univ Penn, Dept Pathol & Lab Med, Perelman Sch Med, Philadelphia, PA 19104 USA
[7] Al Jalila Childrens Specialty Hosp, Dubai, U Arab Emirates
Keywords
GENOME; SUBSTITUTIONS; MUTATION;
DOI
10.1101/gr.240994.118
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Recent advances in DNA sequencing have expanded our understanding of the molecular basis of genetic disorders and increased the utilization of clinical genomic tests. Given the paucity of evidence to accurately classify each variant and the difficulty of experimentally evaluating its clinical significance, a large number of variants generated by clinical tests are reported as variants of unknown clinical significance. Population-scale variant databases can improve clinical interpretation. Specifically, pathogenicity prediction for novel missense variants can use features describing regional variant constraint. Constrained genomic regions are those that have an unusually low variant count in the general population. Computational methods have been introduced to capture these regions and incorporate them into pathogenicity classifiers, but these methods have yet to be compared on an independent clinical variant data set. Here, we introduce one variant data set derived from clinical sequencing panels and use it to compare the ability of different genomic constraint metrics to determine missense variant pathogenicity. This data set is compiled from 17,071 patients surveyed with clinical genomic sequencing for cardiomyopathy, epilepsy, or RASopathies. We further use this data set to demonstrate the necessity of disease-specific classifiers and to train PathoPredictor, a disease-specific ensemble classifier of pathogenicity based on regional constraint and variant-level features. PathoPredictor achieves an average precision >90% for variants from all 99 tested disease genes while approaching 100% accuracy for some genes. The accumulation of larger clinical variant training data sets can significantly enhance the performance of such classifiers in a disease- and gene-specific manner.
Pages: 1144-1151
Page count: 8