共 1 条
Recognition of 3′-end L1, Alu, processed pseudogenes, and mRNA stem-loops in the human genome using sequence-based and structure-based machine-learning models
被引:0
|作者:
Alexander Shein
Anton Zaikin
Maria Poptsova
机构:
[1] National Research University Higher School of Economics,Laboratory of Bioinformatics, Big Data and Information Retrieval School, Faculty of Computer Science
来源:
关键词:
D O I:
暂无
中图分类号:
学科分类号:
摘要:
The role of 3′-end stem-loops in retrotransposition was experimentally demonstrated for transposons of various species, where LINE-SINE retrotransposons share the same 3′-end sequences, containing a stem-loop. We have discovered that 62–68% of processed pseduogenes and mRNAs also have 3′-end stem-loops. We investigated the properties of 3′-end stem-loops of human L1s, Alus, processed pseudogenes and mRNAs that do not share the same sequences, but all have 3′-end stem-loops. We have built sequence-based and structure-based machine-learning models that are able to recognize 3′-end L1, Alu, processed pseudogene and mRNA stem-loops with high performance. The sequence-based models use only sequence information and capture compositional bias in 3′-ends. The structure-based models consider physical, chemical and geometrical properties of dinucleotides composing a stem and position-specific nucleotide content of a loop and a bulge. The most important parameters include shift, tilt, rise, and hydrophilicity. The obtained results clearly point to the existence of structural constrains for 3′-end stem-loops of L1 and Alu, which are probably important for transposition, and reveal the potential of mRNAs to be recognized by the L1 machinery. The proposed approach is applicable to a broader task of recognizing RNA (DNA) secondary structures. The constructed models are freely available at github (https://github.com/AlexShein/transposons/).
引用
收藏
相关论文