RFMirTarget: Predicting Human MicroRNA Target Genes with a Random Forest Classifier

Cited by: 27
Authors
Mendoza, Mariana R. [1 ]
da Fonseca, Guilherme C. [2 ]
Loss-Morais, Guilherme [2 ]
Alves, Ronnie [3 ]
Margis, Rogerio [2 ]
Bazzan, Ana L. C. [1 ]
Affiliations
[1] Univ Fed Rio Grande do Sul, Inst Informat, Porto Alegre, RS, Brazil
[2] Univ Fed Rio Grande do Sul, Ctr Biotecnol, Porto Alegre, RS, Brazil
[3] Inst Tecnol Vale Desenvolvimento Sustentavel, Belem, Para, Brazil
Source
PLOS ONE | 2013 / Vol. 8 / Issue 07
Keywords
MESSENGER-RNAS; BINDING-SITES; IDENTIFICATION; MACHINE; BIOGENESIS; DATABASE; MIRNAS; BAYES;
DOI
10.1371/journal.pone.0070153
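Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification codes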
07 ; 0710 ; 09 ;
Abstract
MicroRNAs are key regulators of eukaryotic gene expression whose fundamental role has already been identified in many cell pathways. The correct identification of miRNA targets is still a major challenge in bioinformatics and has motivated the development of several computational methods to overcome inherent limitations of experimental analysis. Indeed, the best results reported so far in terms of specificity and sensitivity are associated with machine learning-based methods for microRNA-target prediction. Following this trend, in the current paper we discuss and explore a microRNA-target prediction method based on a random forest classifier, namely RFMirTarget. Despite their well-known robustness in general classification tasks, to the best of our knowledge, random forests have not been deeply explored in the specific context of predicting microRNA targets. Our framework first analyzes alignments between candidate microRNA-target pairs and extracts a set of structural, thermodynamic, alignment, seed and position-based features, upon which classification is performed. Experiments have shown that RFMirTarget outperforms several well-known classifiers with statistical significance, and that its performance is not impaired by the class imbalance problem or feature correlation. Moreover, comparing it against other algorithms for microRNA target prediction using independent test data sets from TarBase and starBase, we observe a very promising performance, with higher sensitivity in relation to other methods. Finally, tests performed with RFMirTarget show the benefits of feature selection even for a classifier with embedded feature importance analysis, and the consistency between relevant features identified and important biological properties for effective microRNA-target gene alignment.
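The abstract compares methods in terms of sensitivity and specificity. As a minimal sketch of how these two metrics are computed for a binary target/non-target prediction task, the following Python example uses made-up labels; the function name `sensitivity_specificity` and the example data are hypothetical and are not taken from the paper or from RFMirTarget.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Labels are assumed to be 1 (miRNA target) or 0 (non-target).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives

    # Sensitivity: fraction of real targets that were predicted as targets.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of non-targets that were predicted as non-targets.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical labels for ten candidate miRNA-target pairs (illustration only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Under class imbalance (many more non-targets than targets), overall accuracy can look high even when sensitivity is poor, which is one reason the two metrics are usually reported separately.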
Pages: 18
Related papers
50 in total
  • [1] Predicting the target genes of microRNA based on microarray data
    Cao, B.
    Ji, T.
    Zhou, B.
    Zou, J.
    Jiao, G. Q.
    [J]. GENETICS AND MOLECULAR RESEARCH, 2013, 12 (04) : 6059 - 6066
  • [2] Feature Selection using Random Forest Classifier for Predicting Prostate Cancer
    Huljanah, Mia
    Rustam, Zuherman
    Utama, Suarsih
    Siswantining, Titin
    [J]. 9TH ANNUAL BASIC SCIENCE INTERNATIONAL CONFERENCE 2019 (BASIC 2019), 2019, 546
  • [3] Predicting Students Academic Performance using an Improved Random Forest Classifier
    Jayaprakash, Sujith
    Krishnan, Sangeetha
    Jaiganesh, V
    [J]. 2020 INTERNATIONAL CONFERENCE ON EMERGING SMART COMPUTING AND INFORMATICS (ESCI), 2020 : 238 - 243
  • [4] A RANDOM FOREST CLASSIFIER FOR PREDICTING CEREBRAL VASOSPASM FOLLOWING SUBARACHNOID HEMORRHAGE
    Zarrin, David
    Wilson, Bayard
    Gaonkar, Bilwaj
    Macyszyn, Luke
    Gabel, Eilon
    [J]. ANESTHESIA AND ANALGESIA, 2022, 134 (4S_SUPPL) : 56 - 56
  • [5] Predicting vitamin D deficiency using optimized random forest classifier
    Alloubani, Aladeen
    Abuhaija, Belal
    Almatari, M.
    Jaradat, Ghaith
    Ihnaini, Baha
    [J]. CLINICAL NUTRITION ESPEN, 2024, 60 : 1 - 10
  • [6] Predicting the fate of microRNA target genes based on sequence features
    Pei, Yunfei
    Wang, Xi
    Zhang, Xuegong
    [J]. JOURNAL OF THEORETICAL BIOLOGY, 2009, 261 (01) : 17 - 22
  • [7] Predicting with the quantify intensities of transcription factor-target genes binding using random forest technique
    AL-Mashanji, Ameer K.
    AL-Rashid, Sura Z.
    [J]. INTERNATIONAL JOURNAL OF NONLINEAR ANALYSIS AND APPLICATIONS, 2021, 12 (02) : 145 - 161
  • [8] Thresholding a Random Forest Classifier
    Baumann, Florian
    Li, Fangda
    Ehlers, Arne
    Rosenhahn, Bodo
    [J]. ADVANCES IN VISUAL COMPUTING (ISVC 2014), PT II, 2014, 8888 : 95 - 106
  • [9] Random forest classifier with R
    Ghattas, Badih
    [J]. JOURNAL OF THE SFDS, 2019, 160 (02) : 97 - 98
  • [10] Integration of A Deep Learning Classifier with A Random Forest Approach for Predicting Malonylation Sites
    Zhen Chen
    Ningning He
    Yu Huang
    Wen Tao Qin
    Xuhan Liu
    Lei Li
    [J]. Genomics, Proteomics & Bioinformatics, 2018, 16 (06) : 451 - 459