Deep learning identifies genome-wide DNA binding sites of long noncoding RNAs

Cited by: 24
Authors
Wang, Fan [1, 2]
Chainani, Pranik [3]
White, Tommy [3]
Yang, Jin [1]
Liu, Yu [2]
Soibam, Benjamin [3]
Affiliations
[1] Xi An Jiao Tong Univ, Dept Oncol, Affiliated Hosp 1, Xian, Shaanxi, Peoples R China
[2] Univ Houston, Dept Biol & Biochem, Houston, TX USA
[3] Univ Houston Downtown, Comp Sci & Engn Technol, Houston, TX 77002 USA
Keywords
Long noncoding RNAs; deep learning; triplex; MEG3
DOI
10.1080/15476286.2018.1551704
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Long noncoding RNAs (lncRNAs) can exert their function by interacting with DNA via triplex structure formation. Although this has been validated in a handful of experiments, a genome-wide analysis of lncRNA-DNA binding is still needed. In this paper, we develop and interpret deep learning models that predict the genome-wide binding sites, as determined by ChIRP-Seq experiments, of 12 different lncRNAs. Among the several deep learning architectures tested, a simple architecture consisting of two convolutional neural network layers performed best, suggesting that local sequence patterns are determinants of the interaction. Further interpretation of the kernels in the model revealed that these local sequence patterns form triplex structures with the corresponding lncRNAs. We uncovered several novel triplex-forming domains (TFDs) of these 12 lncRNAs, as well as previously experimentally verified TFDs of the lncRNAs HOTAIR and MEG3. Using electrophoretic mobility shift assays, we experimentally verified two such novel TFDs, of the lncRNAs HOTAIR and TUG1, that were predicted by our method but previously unreported. In conclusion, we show that a simple deep learning architecture can accurately predict genome-wide binding sites of lncRNAs, and interpretation of the models suggests RNA:DNA:DNA triplex formation as a viable mechanism underlying lncRNA-DNA interactions at the genome-wide level.
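To make the "two convolutional layers over local sequence patterns" idea in the abstract concrete, the following is a minimal, untrained sketch in plain Python. It is not the authors' model: the kernel counts, kernel widths, random weights, global max pooling, sigmoid output, and all function names (`one_hot`, `conv1d_relu`, `init_params`, `forward`) are assumptions chosen only for illustration. The abstract specifies none of these details.

```python
import math
import random

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 channels (A, C, G, T); other symbols become all zeros."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq.upper()]

def conv1d_relu(x, kernels, biases):
    """'Valid' 1D convolution followed by ReLU.

    x: list of L rows, each with C_in channel values.
    kernels: list of K kernels, each a list of W rows of C_in weights.
    Returns L - W + 1 rows, each with K activations.
    """
    width = len(kernels[0])
    out = []
    for start in range(len(x) - width + 1):
        row = []
        for kernel, bias in zip(kernels, biases):
            total = bias
            for offset in range(width):
                for c, w in enumerate(kernel[offset]):
                    total += w * x[start + offset][c]
            row.append(max(0.0, total))
        out.append(row)
    return out

def global_max_pool(x):
    """Keep the strongest activation of each channel across all positions."""
    return [max(col) for col in zip(*x)]

def init_params(seed=0, n1=4, w1=6, n2=4, w2=3):
    """Random, untrained weights; all sizes are hypothetical placeholders."""
    rng = random.Random(seed)
    def kernels(n, w, c):
        return [[[rng.uniform(-1, 1) for _ in range(c)] for _ in range(w)] for _ in range(n)]
    return {
        "k1": kernels(n1, w1, 4), "b1": [0.0] * n1,
        "k2": kernels(n2, w2, n1), "b2": [0.0] * n2,
        "w_out": [rng.uniform(-1, 1) for _ in range(n2)], "b_out": 0.0,
    }

def forward(seq, params):
    """Two convolutional layers, global max pooling, then a sigmoid score in (0, 1)."""
    h1 = conv1d_relu(one_hot(seq), params["k1"], params["b1"])
    h2 = conv1d_relu(h1, params["k2"], params["b2"])
    pooled = global_max_pool(h2)
    logit = params["b_out"] + sum(w * v for w, v in zip(params["w_out"], pooled))
    return 1.0 / (1.0 + math.exp(-logit))

# A hand-written width-2 kernel that fires only on the dinucleotide "GA",
# illustrating how a first-layer kernel can act as a local motif detector.
ga_kernel = [[0.0, 0.0, 1.0, 0.0],   # position 0: weight on G
             [1.0, 0.0, 0.0, 0.0]]   # position 1: weight on A
motif_hits = conv1d_relu(one_hot("TTGATT"), [ga_kernel], [-1.0])
score = forward("AGAGAGAGGAGAGAGAAGGA", init_params())
```

In this sketch, `motif_hits` is nonzero only at the window that covers "GA", which is the sense in which a convolution kernel can be read as a sequence motif. The `score` comes from random weights and carries no biological meaning; a real model would be trained on labelled binding and non-binding regions.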
Pages: 1468-1476
Number of pages: 9