Active Learning algorithm for Threshold of Decision Probability on Imbalanced Text Classification based on Protein-Protein Interaction Documents

Cited by: 3
Authors
Xu, Guixian [1, 2]
Niu, Zhendong [1 ]
Gao, Xu [3 ]
Cao, Yujuan [1 ]
Zhao, Yumin [1 ]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci, Beijing, Peoples R China
[2] Minzu Univ, Coll Informat Engn, Beijing, Peoples R China
[3] North China Grid Co Ltd, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
imbalanced text classification; machine learning; protein-protein interaction
DOI
10.1109/DSDE.2010.28
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The study of host-pathogen protein-protein interactions (PPIs) is essential for understanding the disease-causing mechanisms of human pathogens. A large number of scientific findings about PPIs are reported in the biomedical literature. Building a document classification system can accelerate the mining and curation of PPI knowledge. As more and more imbalanced datasets appear, handling the imbalanced classification problem is becoming a hot topic in the machine learning field. In this paper, we propose an Active Learning algorithm for Threshold of Decision Probability (ALTDP) to address the problem of misclassifying the minority class on an imbalanced host-pathogen PPI data set. The results demonstrate that the proposed approach significantly improves classification accuracy on the imbalanced data set.
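The abstract does not describe how ALTDP works internally, so the following is only a minimal sketch of the general idea behind a decision probability threshold on imbalanced data, not the authors' algorithm and without any active learning loop. It assumes a classifier that already outputs a probability for the minority (positive) class; the function names `f1_at_threshold` and `pick_threshold`, the toy scores, and the choice of minority-class F1 as the selection criterion are all hypothetical.

```python
from typing import Sequence, Tuple


def f1_at_threshold(probs: Sequence[float], labels: Sequence[int], threshold: float) -> float:
    """F1 of the minority class (label 1) when predicting 1 for probs >= threshold."""
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    fn = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def pick_threshold(
    probs: Sequence[float], labels: Sequence[int], candidates: Sequence[float]
) -> Tuple[float, float]:
    """Return (threshold, f1) with the highest minority-class F1.

    Ties go to the lowest candidate threshold.
    """
    best_t, best_f1 = min(candidates), -1.0
    for t in sorted(candidates):
        score = f1_at_threshold(probs, labels, t)
        if score > best_f1:
            best_t, best_f1 = t, score
    return best_t, best_f1


# Hypothetical validation scores: 2 relevant documents among 10.
probs = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.45, 0.40, 0.60]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
candidates = [0.3, 0.4, 0.5]

threshold, f1 = pick_threshold(probs, labels, candidates)
print(threshold, round(f1, 3))
```

On this toy data the default cut-off of 0.5 misses one of the two minority-class documents, while a lower threshold recovers it at the cost of one false positive. That trade-off is the reason a tuned threshold can help when the positive class is rare.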
Pages: 78-82
Number of pages: 5
Related papers
50 entries in total
  • [21] SPPS: A Sequence-Based Method for Predicting Probability of Protein-Protein Interaction Partners
    Liu, Xinyi
    Liu, Bin
    Huang, Zhimin
    Shi, Ting
    Chen, Yingyi
    Zhang, Jian
    PLOS ONE, 2012, 7 (01)
  • [22] Prediction of protein-protein interaction types using association rule based classification
    Park, Sung Hee
    Reyes, Jose A.
    Gilbert, David R.
    Kim, Ji Woong
    Kim, Sangsoo
    BMC BIOINFORMATICS, 2009, 10
  • [23] Prediction of protein-protein interaction types using association rule based classification
    Sung Hee Park
    José A Reyes
    David R Gilbert
    Ji Woong Kim
    Sangsoo Kim
    BMC Bioinformatics, 10
  • [24] Structure-based assessment and druggability classification of protein-protein interaction sites
    Alzyoud, Lara
    Bryce, Richard A.
    Al Sorkhy, Mohammad
    Atatreh, Noor
    Ghattas, Mohammad A.
    SCIENTIFIC REPORTS, 2022, 12 (01)
  • [25] Pathway prediction in protein-protein interaction networks based on hierarchical clustering algorithm
    Wang, Shuqin
    Li, Yinzhu
    Liu, Peiyan
    Wei, Jinmao
    Journal of Bionanoscience, 2013, 7 (04): 478-483
  • [26] Functional modules detection based on bat algorithm in protein-protein interaction networks
    Xu J.-H.
    Ji J.-Z.
    Yang C.-C.
    Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2019, 53 (08): 1618-1629
  • [27] UDoNC: An Algorithm for Identifying Essential Proteins Based on Protein Domains and Protein-Protein Interaction Networks
    Peng, Wei
    Wang, Jianxin
    Cheng, Yingjiao
    Lu, Yu
    Wu, Fangxiang
    Pan, Yi
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2015, 12 (02): 276-288
  • [28] Identifying Protein Complexes in Dynamic Protein-Protein Interaction Networks Based on Cuckoo Search Algorithm
    Zhao, Jie
    Lei, Xiujuan
    Wu, Fang-Xiang
    2016 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2016: 1288-1295
  • [29] Protein-Protein Interaction Prediction via Structure-Based Deep Learning
    Liu, Yucong
    Liu, Yijun
    Li, Zhenhai
    PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2024, 92 (11): 1287-1296
  • [30] Classification and prediction of protein–protein interaction interface using machine learning algorithm
    Subhrangshu Das
    Saikat Chakrabarti
    Scientific Reports, 11