On the prediction of non-CG DNA methylation using machine learning

Cited by: 2
Authors
Sereshki, Saleh [1]
Lee, Nathan [1]
Omirou, Michalis [2]
Fasoula, Dionysia [3]
Lonardi, Stefano [1]
Affiliations
[1] Univ Calif Riverside, Dept Comp Sci & Engn, Riverside, CA 92521 USA
[2] Agr Res Inst, Dept Agrobiotechnol, Agr Microbiol Lab, CY-1516 Nicosia, Cyprus
[3] Agr Res Inst, Dept Plant Breeding, CY-1516 Nicosia, Cyprus
Keywords
GENOME
DOI
10.1093/nargab/lqad045
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
DNA methylation can be detected and measured using sequencing instruments after sodium bisulfite conversion, but experiments can be expensive for large eukaryotic genomes. Sequencing nonuniformity and mapping biases can leave parts of the genome with low or no coverage, hampering the ability to obtain DNA methylation levels for all cytosines. To address these limitations, several computational methods have been proposed that predict DNA methylation from the DNA sequence around the cytosine or from the methylation levels of nearby cytosines. However, most of these methods focus entirely on CG methylation in humans and other mammals. In this work, we study, for the first time, the problem of predicting cytosine methylation in the CG, CHG and CHH contexts in six plant species, either from the primary DNA sequence around the cytosine or from the methylation levels of neighboring cytosines. Within this framework, we also study the cross-species prediction problem and the cross-context prediction problem (within the same species). Finally, we show that providing gene and repeat annotations allows existing classifiers to significantly improve their prediction accuracy. We introduce a new classifier called AMPS (annotation-based methylation prediction from sequence) that takes advantage of genomic annotations to achieve higher accuracy.
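To make the sequence-based setting in the abstract concrete, the following is a minimal sketch of two preprocessing steps such a predictor would plausibly need: labeling a cytosine's context (CG, CHG or CHH, where H stands for A, C or T) and one-hot encoding the window of bases around it. This is not the AMPS implementation; the function names `cytosine_context` and `one_hot_window`, the `flank` parameter, the zero-padding of out-of-range positions, and the restriction to the forward strand are all assumptions made for illustration.

```python
from typing import List, Optional

BASES = "ACGT"
H_BASES = "ACT"  # IUPAC code H: any base except G


def cytosine_context(seq: str, i: int) -> Optional[str]:
    """Return 'CG', 'CHG' or 'CHH' for the forward-strand cytosine at index i.

    Returns None if seq[i] is not a cytosine, or if too few downstream
    bases are available to decide the context.
    """
    seq = seq.upper()
    if not 0 <= i < len(seq) or seq[i] != "C":
        return None
    downstream = seq[i + 1:i + 3]
    if len(downstream) >= 1 and downstream[0] == "G":
        return "CG"
    if len(downstream) < 2 or downstream[0] not in H_BASES:
        return None
    if downstream[1] == "G":
        return "CHG"
    if downstream[1] in H_BASES:
        return "CHH"
    return None


def one_hot_window(seq: str, i: int, flank: int) -> List[List[int]]:
    """One-hot encode the 2*flank+1 bases centered on index i.

    Positions outside the sequence, and any non-ACGT symbol, are
    encoded as an all-zero row (an illustrative padding choice).
    """
    seq = seq.upper()
    rows = []
    for j in range(i - flank, i + flank + 1):
        base = seq[j] if 0 <= j < len(seq) else "N"
        rows.append([1 if base == b else 0 for b in BASES])
    return rows
```

A classifier trained on such windows would treat each context separately or jointly; which of these the authors do is not stated in the abstract, so the sketch stops at feature extraction.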
Pages: 11