Tree polynomials identify a link between co-transcriptional R-loops and nascent RNA folding
Cited by: 0
Authors:
Liu, Pengyu [1]
Lusk, Jacob [1]
Jonoska, Natasa [2]
Vazquez, Mariel [1,3]
Affiliations:
[1] Univ Calif Davis, Dept Microbiol & Mol Genet, Davis, CA 95616 USA
[2] Univ S Florida, Dept Math & Stat, Tampa, FL USA
[3] Univ Calif Davis, Dept Math, Davis, CA 95616 USA
Funding:
U.S. National Science Foundation;
Keywords:
DNA sequences - Polynomials - Trees (mathematics);
DOI:
10.1371/journal.pcbi.1012669
Chinese Library Classification (CLC) number:
Q5 [Biochemistry];
Subject classification codes:
071010; 081704;
Abstract:
R-loops are a class of non-canonical nucleic acid structures that typically form during transcription when the nascent RNA hybridizes the DNA template strand, leaving the non-template DNA strand unpaired. These structures are abundant in nature and play important physiological and pathological roles. Recent research shows that DNA sequence and topology affect R-loops, yet it remains unclear how these and other factors contribute to R-loop formation. In this work, we investigate the link between nascent RNA folding and the formation of R-loops. We introduce tree-polynomials, a new class of representations of RNA secondary structures. A tree-polynomial representation consists of a rooted tree associated with an RNA secondary structure together with a polynomial that is uniquely identified with the rooted tree. Tree-polynomials enable accurate, interpretable and efficient data analysis of RNA secondary structures without pseudoknots. We develop a computational pipeline for investigating and predicting R-loop formation from a genomic sequence. The pipeline obtains nascent RNA secondary structures from a co-transcriptional RNA folding software, and computes the tree-polynomial representations of the structures. By applying this pipeline to plasmid sequences that contain R-loop forming genes, we establish a strong correlation between the coefficient sums of tree-polynomials and the experimental probability of R-loop formation. Such strong correlation indicates that the pipeline can be used for accurate R-loop prediction. Furthermore, the interpretability of tree-polynomials allows us to characterize the features of RNA secondary structure associated with R-loop formation. In particular, we identify that branches with short stems separated by bulges and interior loops are associated with R-loops.
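The abstract describes a tree-polynomial as a rooted tree derived from an RNA secondary structure plus a polynomial computed from that tree, with coefficient sums used as the summary statistic. The abstract does not give the construction, so the following is only a minimal sketch under stated assumptions: the structure is given in pseudoknot-free dot-bracket notation, each base pair becomes an internal node and each unpaired base a leaf (a hypothetical mapping, not necessarily the authors'), and the polynomial follows an assumed recursion in which a leaf contributes x and an internal node contributes y plus the product of its children's polynomials. The function names are illustrative and are not taken from the paper's pipeline.

```python
from collections import Counter


def dot_bracket_to_tree(structure):
    """Parse a pseudoknot-free dot-bracket string into a rooted tree.

    ASSUMPTION: each base pair is an internal node and each unpaired base
    is a leaf. A node is a list of its children; a leaf is an empty list.
    """
    root = []
    stack = [root]
    for ch in structure:
        if ch == "(":
            node = []
            stack[-1].append(node)
            stack.append(node)
        elif ch == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')' in structure")
            stack.pop()
        elif ch == ".":
            stack[-1].append([])
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    if len(stack) != 1:
        raise ValueError("unbalanced '(' in structure")
    return root


def poly_mul(p, q):
    """Multiply two bivariate polynomials stored as {(deg_x, deg_y): coeff}."""
    out = Counter()
    for (a, b), c in p.items():
        for (d, e), f in q.items():
            out[(a + d, b + e)] += c * f
    return out


def tree_polynomial(node):
    """ASSUMED recursion: leaf -> x; internal node -> y + product of children."""
    if not node:
        return Counter({(1, 0): 1})  # the monomial x
    prod = Counter({(0, 0): 1})  # the constant 1
    for child in node:
        prod = poly_mul(prod, tree_polynomial(child))
    prod[(0, 1)] += 1  # add the monomial y
    return prod


def coefficient_sum(poly):
    """Sum of all coefficients, i.e. the polynomial evaluated at x = y = 1."""
    return sum(poly.values())
```

For example, `coefficient_sum(tree_polynomial(dot_bracket_to_tree("(..)")))` evaluates the polynomial 2y + x^2 at x = y = 1. The recursive form is meant for short illustrative structures; deeply nested structures would need an iterative traversal to stay within Python's recursion limit.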
Pages: 24