Predicting gene function using hierarchical multi-label decision tree ensembles

Cited by: 125
Authors
Schietgat, Leander [1]
Vens, Celine [1]
Struyf, Jan [1]
Blockeel, Hendrik [1]
Kocev, Dragi [2]
Dzeroski, Saso [2]
Affiliations
[1] Katholieke Univ Leuven, Dept Comp Sci, B-3001 Leuven, Belgium
[2] Jozef Stefan Inst, Dept Knowledge Technol, Ljubljana 1000, Slovenia
Source
BMC BIOINFORMATICS | 2010 / Vol. 11
Funding
U.S. National Science Foundation; Research Foundation Flanders (FWO), Belgium;
Keywords
PROTEIN FUNCTION; SCALE DATA; CLASSIFICATION; INTEGRATION; ASSOCIATION; ANNOTATION; DATABASE;
DOI
10.1186/1471-2105-11-2
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: S. cerevisiae, A. thaliana and M. musculus are well-studied organisms in biology and the sequencing of their genomes was completed many years ago. It is still a challenge, however, to develop methods that assign biological functions to the ORFs in these genomes automatically. Different machine learning methods have been proposed to this end, but it remains unclear which method is to be preferred in terms of predictive performance, efficiency and usability.

Results: We study the use of decision tree based models for predicting the multiple functions of ORFs. First, we describe an algorithm for learning hierarchical multi-label decision trees. These can simultaneously predict all the functions of an ORF, while respecting a given hierarchy of gene functions (such as FunCat or GO). We present new results obtained with this algorithm, showing that the trees found by it exhibit clearly better predictive performance than the trees found by previously described methods. Nevertheless, the predictive performance of individual trees is lower than that of some recently proposed statistical learning methods. We show that ensembles of such trees are more accurate than single trees and are competitive with state-of-the-art statistical learning and functional linkage methods. Moreover, the ensemble method is computationally efficient and easy to use.

Conclusions: Our results suggest that decision tree based methods are a state-of-the-art, efficient and easy-to-use approach to ORF function prediction.
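
As a rough illustration of the two ideas the abstract names (predictions that respect a function hierarchy, and ensembles of trees), the Python sketch below averages the per-class frequencies of two tree leaves and thresholds the result. It is not the authors' implementation: the toy hierarchy, the class names, the helper functions and the 0.5 threshold are all hypothetical, and the tree-learning step itself is left out. The sketch assumes training label sets are closed upward in the hierarchy, so a parent class is always at least as frequent as any of its children and a single threshold yields a hierarchy-consistent label set.

```python
# Hypothetical toy hierarchy (FunCat-like notation): child -> parent, None = top level.
PARENT = {"1": None, "1/1": "1", "1/2": "1", "2": None, "2/1": "2"}
CLASSES = sorted(PARENT)


def close_upward(labels):
    """Add every ancestor of each label so the set respects the hierarchy."""
    closed = set()
    for label in labels:
        while label is not None:
            closed.add(label)
            label = PARENT[label]
    return closed


def leaf_frequencies(label_sets):
    """Fraction of a leaf's training examples carrying each class (after closure)."""
    closed = [close_upward(s) for s in label_sets]
    return {c: sum(c in s for s in closed) / len(closed) for c in CLASSES}


def average_predictions(per_tree):
    """Average the per-class vectors returned by the trees of an ensemble."""
    return {c: sum(p[c] for p in per_tree) / len(per_tree) for c in CLASSES}


def predict_labels(probs, threshold):
    """Predict every class whose averaged frequency reaches the threshold."""
    return {c for c, p in probs.items() if p >= threshold}


# Two hypothetical trees route the same gene to different leaves.
leaf_a = leaf_frequencies([{"1/1"}, {"1/2"}, {"2/1"}])
leaf_b = leaf_frequencies([{"1/1"}, {"1/1", "2"}])
ensemble = average_predictions([leaf_a, leaf_b])
predicted = predict_labels(ensemble, threshold=0.5)
print(sorted(predicted))
```

Because averaging preserves the parent-at-least-child ordering, the predicted set stays closed upward for any threshold; lowering the threshold only adds classes further down the hierarchy.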
Pages: 14