Hierarchical multi-label classification based on over-sampling and hierarchy constraint for gene function prediction

Cited by: 10
Authors
Chen, Benhui [1, 2]
Hu, Jinglu [1]
Affiliations
[1] Waseda Univ, Grad Sch Informat Prod & Syst, Wakamatsu Ku, Kitakyushu, Fukuoka 8080135, Japan
[2] Dali Univ, Sch Math & Comp Sci, Dali 671003, Yunnan, Peoples R China
Keywords
hierarchical multi-label classification; imbalanced dataset learning; hierarchical SMOTE; consistency ensemble
DOI
10.1002/tee.21714
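Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract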
Hierarchical multi-label classification (HMC) is a variant of classification in which instances may belong to multiple classes at the same time and the classes are organized in a hierarchy. Gene function prediction is a complicated HMC problem with a large number of classes and usually strongly imbalanced class distributions. This paper proposes an improved HMC method based on over-sampling and a hierarchy constraint for the gene function prediction problem. The HMC task is transformed into a set of binary support vector machine (SVM) classification tasks. Two measures then enhance HMC performance by introducing the hierarchy constraint into the learning procedure. First, for imbalanced classes, a hierarchical synthetic minority over-sampling technique (SMOTE) is proposed as an over-sampling preprocessing step to improve SVM learning performance. Second, an improved True Path Rule (TPR) ensemble approach is introduced to combine the results of the binary probabilistic SVM classifiers; it improves the classification results and guarantees that they satisfy the class hierarchy constraint. Experimental results on four benchmark FunCat Yeast datasets show that the proposed method significantly outperforms the basic TPR method and the Flat ensemble method. © 2012 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
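The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the hierarchical SMOTE or the improved TPR ensemble work, so the code below shows plain SMOTE interpolation and a simple top-down cap (a child class score may not exceed its parent's score) as stand-ins. The function names, the toy data, and the FunCat-style class codes are all hypothetical.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Plain SMOTE (assumption: not the paper's hierarchical variant).

    Each synthetic point lies on the segment between a random minority
    point and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> target
        synthetic.append(tuple(b + gap * (t - b) for b, t in zip(base, target)))
    return synthetic


def enforce_hierarchy(scores, parent):
    """Simple consistency step (assumption: not the paper's improved TPR).

    Caps every class score by the already-capped score of its parent, so a
    class is never scored higher than any of its ancestors.
    """
    consistent = {}

    def resolve(label):
        if label not in consistent:
            p = parent.get(label)
            cap = resolve(p) if p is not None else 1.0
            consistent[label] = min(scores[label], cap)
        return consistent[label]

    for label in scores:
        resolve(label)
    return consistent


# Toy minority class in a 2-D feature space (hypothetical data).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority, n_new=5)

# Hypothetical per-class probabilities from independent binary classifiers.
scores = {"01": 0.4, "01.01": 0.7, "01.01.03": 0.2, "02": 0.9}
parent = {"01": None, "01.01": "01", "01.01.03": "01.01", "02": None}
fixed = enforce_hierarchy(scores, parent)
```

In this toy example the score of class `01.01` is lowered from 0.7 to 0.4 because its parent `01` scored only 0.4; thresholding the corrected scores then cannot predict a class without also predicting its ancestors.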
Pages: 183-189
Number of pages: 7
Related papers
50 in total
  • [21] A novel ensemble over-sampling approach based Chebyshev inequality for imbalanced multi-label data
    Ren, Weishuo
    Zheng, Yifeng
    Zhang, Wenjie
    Qing, Depeng
    Zeng, Xianlong
    Li, Guohe
    NEUROCOMPUTING, 2025, 612
  • [22] Hierarchy exploitation to detect missing annotations on hierarchical multi-label classification
    Romero, Miguel
    Nakano, Felipe Kenji
    Finke, Jorge
    Rocha, Camilo
    Vens, Celine
    arXiv, 2022
  • [23] Hierarchical multi-label classification based on LSTM network and Bayesian decision theory for LncRNA function prediction
    Shou Feng
    Huiying Li
    Jiaqing Qiao
    Scientific Reports, 12
  • [24] Hierarchical multi-label classification based on LSTM network and Bayesian decision theory for LncRNA function prediction
    Feng, Shou
    Li, Huiying
    Qiao, Jiaqing
    SCIENTIFIC REPORTS, 2022, 12 (01)
  • [25] Multi-Label Hierarchical Classification using a Competitive Neural Network for Protein Function Prediction
    Borges, Helyane Bronoski
    Nievola, Julio Cesar
    2012 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2012
  • [26] Exploiting MEDLINE for gene molecular function prediction via NMF based multi-label classification
    Fodeh, Samah Jamal
    Tiwari, Aditya
    JOURNAL OF BIOMEDICAL INFORMATICS, 2018, 86 : 160 - 166
  • [27] Cluster Tree based Multi-Label Classification for Protein Function Prediction
    Wu, Qingyao
    Ye, Yunming
    Zhang, Xiaofeng
    Ho, Shen-Shyang
    2013 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2013
  • [28] Leveraging class hierarchy for detecting missing annotations on hierarchical multi-label classification
    Romero, Miguel
    Nakano, Felipe Kenji
    Finke, Jorge
    Rocha, Camilo
    Vens, Celine
    COMPUTERS IN BIOLOGY AND MEDICINE, 2023, 152
  • [29] Hierarchical Multi-Label Gene Function Prediction using Adaptive Mutation in Crowding Niching
    Kordmahalleh, Mina Moradi
    Homaifar, Abdollah
    Kc, Dukka B.
    2013 IEEE 13TH INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOENGINEERING (BIBE), 2013
  • [30] ReliefF for Hierarchical Multi-label Classification
    Slavkov, Ivica
    Karcheska, Jana
    Kocev, Dragi
    Kalajdziski, Slobodan
    Dzeroski, Saso
    NEW FRONTIERS IN MINING COMPLEX PATTERNS, NFMCP 2013, 2014, 8399 : 148 - 161