Predicting protein function via multi-label supervised topic model on gene ontology

Cited by: 10
Authors
Liu, Lin [1 ,2 ]
Tang, Lin [3 ]
He, Libo [1 ]
Yao, Shaowen [4 ]
Zhou, Wei [4 ]
Affiliations
[1] Yunnan Univ, Sch Informat, Kunming, Yunnan, Peoples R China
[2] Yunnan Normal Univ, Sch Informat, Minist Educ, Key Lab Educ Informatizat Nationalities, Kunming, Yunnan, Peoples R China
[3] Yunnan Normal Univ, Key Lab Educ Informatizat Nationalities, Minist Educ, Kunming, Yunnan, Peoples R China
[4] Yunnan Univ, Natl Pilot Sch Software, Kunming, Yunnan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Topic modelling; protein function; gene ontology; multi-label classification; NETWORKS;
DOI
10.1080/13102818.2017.1307697
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
As biological datasets accumulate rapidly, computational methods that automate protein function prediction are critically needed. Protein function prediction can be treated as a multi-label classification problem whose output is a set of functional annotations for each protein. Biologists, however, also want to discover the correlations between protein attributes and functions. We introduce a multi-label supervised topic model into protein function prediction and investigate the advantages of this approach. The model not only estimates the probability distribution over functions for each protein instance, but also directly provides the probability distribution over words for each function. To the best of our knowledge, this is the first effort to apply a multi-label supervised topic model to protein function prediction. In this paper, we model a protein as a document and a function label as a topic. First, a set of protein sequences is formalized into a bag of words. Then, we perform inference and estimate the model parameters to predict protein functions. Experimental results on yeast and human datasets demonstrate the effectiveness of the model for protein function prediction and show that it delivers superior results to the compared algorithms. In summary, the method provides a new, efficient approach to protein function prediction and reveals more information about functions.
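The "protein as a document" step can be illustrated with a minimal sketch. The abstract does not say how sequences are split into words, so the overlapping k-mer tokenization, the function name `sequence_to_bag_of_words`, the choice `k=3`, and the toy sequences below are all assumptions made for illustration, not the authors' procedure.

```python
from collections import Counter


def sequence_to_bag_of_words(sequence: str, k: int = 3) -> Counter:
    """Count overlapping k-mers ("words") in one protein sequence.

    Assumption: words are overlapping k-mers; the source abstract
    does not specify the tokenization.
    """
    seq = sequence.strip().upper()
    # A sequence shorter than k yields an empty bag.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Hypothetical toy sequences, not real proteins from the paper's datasets.
proteins = {"P1": "MKVLAAGMKV", "P2": "GAVLMKVL"}

# Each protein becomes a "document" represented by its word counts.
corpus = {pid: sequence_to_bag_of_words(seq, k=3) for pid, seq in proteins.items()}
```

In a supervised topic model of the kind described, each such bag of words would be paired with the protein's set of function labels (the topics) during training.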
Pages: 630-638
Page count: 9