A framework for improving microRNA prediction in non-human genomes

Cited by: 34
Authors
Peace, Robert J. [1]
Biggar, Kyle K. [2, 3, 4]
Storey, Kenneth B. [2, 3]
Green, James R. [1]
Affiliations
[1] Carleton Univ, Dept Syst & Comp Engn, Ottawa, ON K1S 5B6, Canada
[2] Carleton Univ, Inst Biochem, Ottawa, ON K1S 5B6, Canada
[3] Carleton Univ, Dept Biol, Ottawa, ON K1S 5B6, Canada
[4] Univ Western Ontario, Dept Biochem, London, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
SUPPORT VECTOR MACHINE; EFFECTIVE CLASSIFICATION; RANDOM FOREST; PRECURSORS; IDENTIFICATION; EFFICIENT; SELECTION; SEQUENCE; FEATURES; REGIONS;
DOI
10.1093/nar/gkv698
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The prediction of novel pre-microRNA (miRNA) from genomic sequence has received considerable attention recently. However, the majority of studies have focused on the human genome. Previous studies have demonstrated that sensitivity (correctly detecting true miRNA) is sustained when human-trained methods are applied to other species; however, they have failed to report the dramatic drop in specificity (the ability to correctly reject non-miRNA sequences) in non-human genomes. Considering that the ratio of true miRNA sequences to pseudo-miRNA sequences is on the order of 1:1000, such low specificity prevents the application of most existing tools to non-human genomes, as the number of false positives overwhelms the true predictions. Here we introduce a framework (SMIRP) for creating species-specific miRNA prediction systems, leveraging sequence conservation and phylogenetic distance information. Substantial improvements in specificity and precision are obtained for four non-human test species when our framework is applied to three different prediction systems representing two types of classifiers (support vector machine and Random Forest), based on three different feature sets, with both human-specific and taxon-wide training data. The SMIRP framework is potentially applicable to all miRNA prediction systems, and we expect substantial improvement in precision and specificity, while sustaining sensitivity, independent of the machine learning technique chosen.
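The abstract's point about class imbalance can be illustrated with a short calculation. The following is a minimal sketch, not the authors' method: it only applies the standard definitions of sensitivity, specificity and precision to the 1:1000 ratio quoted above. The function name `precision_from_rates` and the sensitivity and specificity values used are hypothetical and are not taken from the paper.

```python
def precision_from_rates(sensitivity, specificity, n_true, n_pseudo):
    """Expected precision of a classifier given its sensitivity and
    specificity and the number of true and pseudo sequences scanned."""
    true_positives = sensitivity * n_true
    false_positives = (1.0 - specificity) * n_pseudo
    predicted_positives = true_positives + false_positives
    if predicted_positives == 0:
        raise ValueError("no positive predictions; precision is undefined")
    return true_positives / predicted_positives


# Hypothetical rates, with the 1:1000 true-to-pseudo ratio from the abstract.
for spec in (0.95, 0.999):
    p = precision_from_rates(sensitivity=0.9, specificity=spec,
                             n_true=1, n_pseudo=1000)
    print(f"specificity={spec}: precision={p:.4f}")
```

With these assumed rates, a specificity of 0.95 gives a precision of under 2%, because roughly 50 false positives accompany each true prediction, while a specificity of 0.999 raises precision to nearly 50%. This is why a modest-looking drop in specificity can make a predictor impractical at this class ratio.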
Pages: 12