Active Learning algorithm for Threshold of Decision Probability on Imbalanced Text Classification based on Protein-Protein Interaction Documents

Cited by: 3
Authors
Xu, Guixian [1, 2]
Niu, Zhendong [1]
Gao, Xu [3]
Cao, Yujuan [1]
Zhao, Yumin [1]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci, Beijing, Peoples R China
[2] Minzu Univ, Coll Informat Engn, Beijing, Peoples R China
[3] North China Grid Co Ltd, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
imbalanced text classification; machine learning; protein-protein interaction
DOI
10.1109/DSDE.2010.28
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The study of host-pathogen protein-protein interactions (PPIs) is essential to understanding the disease-causing mechanisms of human pathogens. A large number of scientific findings about PPIs are reported in the biomedical literature. Building a document classification system can accelerate the mining and curation of PPI knowledge. As imbalanced data sets become more common, handling imbalanced classification has become an active topic in machine learning. In this paper, we propose an Active Learning algorithm for Threshold of Decision Probability (ALTDP) to address the misclassification of the minority class on an imbalanced host-pathogen PPI data set. The results demonstrate that the proposed approach significantly improves classification accuracy on the imbalanced data set.
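The abstract does not describe how ALTDP works, so the following is not the authors' algorithm. It is a minimal Python sketch of the general idea the title suggests, under stated assumptions: a classifier outputs a probability for the minority (PPI-relevant) class, the decision threshold is tuned on labeled data instead of being fixed at 0.5, and an active learner queries the unlabeled documents whose probability lies closest to that threshold. The function names, the candidate thresholds, the F1 criterion, and the toy data are all hypothetical.

```python
from typing import List, Sequence, Tuple


def f1_at_threshold(probs: Sequence[float], labels: Sequence[int],
                    threshold: float) -> float:
    """F1 score of the minority class (label 1) at a given threshold."""
    tp = fp = fn = 0
    for p, y in zip(probs, labels):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def pick_threshold(probs: Sequence[float], labels: Sequence[int],
                   candidates: Sequence[float]) -> Tuple[float, float]:
    """Return the candidate threshold with the highest minority-class F1.

    Assumption: F1 is the selection criterion; the paper may use another.
    """
    best_t, best_f1 = 0.5, -1.0
    for t in candidates:
        f1 = f1_at_threshold(probs, labels, t)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


def most_uncertain(probs: Sequence[float], threshold: float,
                   k: int) -> List[int]:
    """Indices of the k unlabeled documents closest to the threshold.

    This is plain uncertainty sampling measured against the tuned
    threshold, used here as a stand-in for an active learning query step.
    """
    order = sorted(range(len(probs)), key=lambda i: abs(probs[i] - threshold))
    return order[:k]


# Toy imbalanced validation set: 8 negatives, 2 positives (hypothetical).
val_probs = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.45, 0.40, 0.60]
val_labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

threshold, f1 = pick_threshold(val_probs, val_labels,
                               [0.1, 0.2, 0.3, 0.4, 0.5])
queries = most_uncertain([0.05, 0.38, 0.90, 0.45], threshold, k=2)
print(threshold, round(f1, 3), queries)
```

On this toy data the default threshold of 0.5 misses one of the two minority documents, while a lower threshold recovers both at the cost of one false positive, which is the kind of trade-off a tuned decision threshold is meant to manage on imbalanced data.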
Pages: 78-82
Number of pages: 5