Machine Learning Prediction of Non-Coding Variant Impact in Human Retinal cis-Regulatory Elements

Cited by: 3
Authors
VandenBosch, Leah S. [1]
Luu, Kelsey [1]
Timms, Andrew E. [1]
Challam, Shriya [1]
Wu, Yue [2]
Lee, Aaron Y. [2, 3]
Cherry, Timothy J. [1, 3, 4]
Affiliations
[1] Seattle Childrens Res Inst, Ctr Dev Biol & Regenerat Med, Seattle, WA USA
[2] Univ Washington, Dept Ophthalmol, Seattle, WA USA
[3] Brotman Baty Inst Precis Med, Seattle, WA USA
[4] Univ Washington, Dept Pediat, Seattle, WA USA
Source
Funding
National Institutes of Health (NIH)
Keywords
TRANSCRIPTION FACTORS; ENHANCER; BINDING; GRAMMAR; RARE; GKM;
DOI
10.1167/tvst.11.4.16
Chinese Library Classification (CLC) number
R77 [Ophthalmology]
Discipline classification code
100212
Abstract
Purpose: Prior studies have demonstrated the significance of specific cis-regulatory variants in retinal disease; however, determining the functional impact of regulatory variants remains a major challenge. In this study, we used a machine learning approach, trained on epigenomic data from the adult human retina, to systematically quantify the predicted impact of cis-regulatory variants.
Methods: We used human retinal DNA accessibility data (ATAC-seq) to determine a set of 18.9k high-confidence, putative cis-regulatory elements (CREs). Eighty percent of these elements were used to train a machine learning model based on a gapped k-mer support vector machine. In silico saturation mutagenesis and variant scoring were applied to predict the functional impact of all potential single nucleotide variants within cis-regulatory elements. Impact scores were tested in a 20% hold-out dataset and compared to allele population frequency, phylogenetic conservation, transcription factor (TF) binding motifs, and existing massively parallel reporter assay data.
Results: We generated a model that distinguishes between human retinal regulatory elements and negative test sequences with 95% accuracy. Among a hold-out test set of 3.7k human retinal CREs, all possible single nucleotide variants were scored. Variants with negative impact scores correlated with higher phylogenetic conservation of the reference allele, disruption of predicted TF binding motifs, and massively parallel reporter expression.
Conclusions: We demonstrated the utility of human retinal epigenomic data to train a machine learning model for the purpose of predicting the impact of non-coding regulatory sequence variants. Our model accurately scored sequences and predicted putative transcription factor binding motifs. This approach has the potential to expedite the characterization of pathogenic non-coding sequence variants in the context of unexplained retinal disease.
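The in silico saturation mutagenesis step described in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not give the scoring formula, so this assumes a simple delta-style score (alternate-allele score minus reference-allele score) over ungapped k-mers. The weight table `TOY_WEIGHTS`, the example sequence, and the function names are all hypothetical; a trained gapped k-mer SVM would supply real weights.

```python
BASES = "ACGT"

# Hypothetical k-mer weights standing in for a trained model's output.
TOY_WEIGHTS = {"GAT": 1.0, "ATT": 0.5, "TTA": 0.8}


def kmer_score(seq, weights, k):
    """Sum the weights of every length-k window in seq (unknown k-mers score 0)."""
    return sum(weights.get(seq[i:i + k], 0.0) for i in range(len(seq) - k + 1))


def saturation_mutagenesis(seq, weights, k):
    """Score every possible single nucleotide variant of seq.

    Returns (position, ref, alt, impact) tuples, where impact is the
    alternate-allele score minus the reference-allele score. Only k-mers
    overlapping the variant can change, so scoring is limited to that window.
    """
    results = []
    for pos, ref in enumerate(seq):
        start = max(0, pos - k + 1)
        end = min(len(seq), pos + k)
        ref_score = kmer_score(seq[start:end], weights, k)
        for alt in BASES:
            if alt == ref:
                continue
            alt_window = seq[start:pos] + alt + seq[pos + 1:end]
            impact = kmer_score(alt_window, weights, k) - ref_score
            results.append((pos, ref, alt, impact))
    return results


if __name__ == "__main__":
    # Toy sequence; a negative impact means the variant lowers the score.
    for pos, ref, alt, impact in saturation_mutagenesis("CCGATTACC", TOY_WEIGHTS, 3):
        if impact < 0:
            print(f"{pos}\t{ref}>{alt}\t{impact:+.2f}")
```

Under these assumptions, a sequence of length n yields 3n scored variants, and negative impacts mark substitutions that remove high-weight k-mers, loosely mirroring the negative impact scores discussed in the Results.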
Pages: 16
Related papers
50 in total
  • [1] Non-coding transcription at cis-regulatory elements: Computational and experimental approaches
    Simonatto, Marta
    Barozzi, Iros
    Natoli, Gioacchino
    [J]. METHODS, 2013, 63 (01) : 66 - 75
  • [2] Cell-specific cis-regulatory elements and mechanisms of non-coding genetic disease in human retina and retinal organoids
    Thomas, Eric D.
    Timms, Andrew E.
    Giles, Sarah
    Harkins-Perry, Sarah
    Lyu, Pin
    Hoang, Thanh
    Qian, Jiang
    Jackson, Victoria E.
    Bahlo, Melanie
    Blackshaw, Seth
    Friedlander, Martin
    Eade, Kevin
    Cherry, Timothy J.
    [J]. DEVELOPMENTAL CELL, 2022, 57 (06) : 820 - +
  • [3] Properties of non-coding DNA and identification of putative cis-regulatory elements in Theileria parva
    Guo, Xiang
    Silva, Joana C.
    [J]. BMC GENOMICS, 2008, 9 (1)
  • [4] Properties of non-coding DNA and identification of putative cis-regulatory elements in Theileria parva
    Xiang Guo
    Joana C Silva
    [J]. BMC Genomics, 9
  • [5] Non-coding variants impact cis-regulatory coordination in a cell type-specific manner
    Pushkarev, Olga
    van Mierlo, Guido
    Kribelbauer, Judith Franziska
    Saelens, Wouter
    Gardeux, Vincent
    Deplancke, Bart
    [J]. GENOME BIOLOGY, 2024, 25 (01):
  • [6] Profiling of conserved non-coding elements upstream of SHOX and functional characterisation of the SHOX cis-regulatory landscape
    Hannah Verdin
    Ana Fernández-Miñán
    Sara Benito-Sanz
    Sandra Janssens
    Bert Callewaert
    Kathleen De Waele
    Jean De Schepper
    Inge François
    Björn Menten
    Karen E. Heath
    José Luis Gómez-Skarmeta
    Elfride De Baere
    [J]. Scientific Reports, 5
  • [7] Profiling of conserved non-coding elements upstream of SHOX and functional characterisation of the SHOX cis-regulatory landscape
    Verdin, Hannah
    Fernandez-Minan, Ana
    Benito-Sanz, Sara
    Janssens, Sandra
    Callewaert, Bert
    De Waele, Kathleen
    De Schepper, Jean
    Francois, Inge
    Menten, Björn
    Heath, Karen E.
    Gomez-Skarmeta, Jose Luis
    De Baere, Elfride
    [J]. SCIENTIFIC REPORTS, 2015, 5
  • [8] Mapping the Cis-Regulatory Architecture of the Human Retina Reveals Non-Coding Genetic Variation in Disease
    Cherry, Timothy
    [J]. INVESTIGATIVE OPHTHALMOLOGY & VISUAL SCIENCE, 2019, 60 (09)
  • [9] Cis-regulatory elements and human evolution
    Siepel, Adam
    Arbiza, Leonardo
    [J]. CURRENT OPINION IN GENETICS & DEVELOPMENT, 2014, 29 : 81 - 89
  • [10] The identification of cis-regulatory elements: A review from a machine learning perspective
    Li, Yifeng
    Chen, Chih-yu
    Kaye, Alice M.
    Wasserman, Wyeth W.
    [J]. BIOSYSTEMS, 2015, 138 : 6 - 17