DeepGraphGO: graph neural network for large-scale, multispecies protein function prediction

Cited by: 50
Authors
You, Ronghui [1 ]
Yao, Shuwei [1 ]
Mamitsuka, Hiroshi [2 ,3 ]
Zhu, Shanfeng [4 ,5 ,6 ,7 ,8 ,9 ]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai 200433, Peoples R China
[2] Kyoto Univ, Bioinformat Ctr, Inst Chem Res, Uji, Kyoto 6110011, Japan
[3] Aalto Univ, Dept Comp Sci, Espoo, Finland
[4] Fudan Univ, Inst Sci & Technol Brain Inspired Intelligence, Shanghai 200433, Peoples R China
[5] Fudan Univ, Shanghai Inst Artificial Intelligence Algorithms, Shanghai 200433, Peoples R China
[6] Fudan Univ, Key Lab Computat Neurosci & Brain Inspired Intell, Minist Educ, Shanghai 200433, Peoples R China
[7] Fudan Univ, Shanghai Key Lab Intelligent Informat Proc, Shanghai 200433, Peoples R China
[8] Fudan Univ, MOE Frontiers Ctr Brain Sci, Shanghai 200433, Peoples R China
[9] Zhangjiang Fudan Int Innovat Ctr, Shanghai 200433, Peoples R China
Funding
Academy of Finland; National Natural Science Foundation of China;
Keywords
ONTOLOGY; DATABASE;
DOI
10.1093/bioinformatics/btab270
Chinese Library Classification (CLC)
Q5 [Biochemistry];
Discipline codes
071010 ; 081704 ;
Abstract
Motivation: Automated function prediction (AFP) of proteins is a large-scale multi-label classification problem. Two limitations of most network-based methods for AFP are that (i) a separate model must be trained for each species and (ii) protein sequence information is ignored entirely. These limitations lead to weaker performance than sequence-based methods. The challenge is therefore to develop a powerful network-based method for AFP that overcomes these limitations. Results: We propose DeepGraphGO, an end-to-end, multispecies graph neural network-based method for AFP that makes the most of both protein sequence and high-order protein network information. Our multispecies strategy allows a single model to be trained for all species, giving access to a larger number of training samples than existing methods. Extensive experiments with a large-scale dataset show that DeepGraphGO significantly outperforms a number of competing state-of-the-art methods, including DeepGOPlus and three representative network-based methods: GeneMANIA, deepNF and clusDCA. We further confirm the effectiveness of our multispecies strategy and the advantage of DeepGraphGO on so-called difficult proteins. Finally, we integrate DeepGraphGO as a component into the state-of-the-art ensemble method NetGO and achieve a further performance improvement.
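To make the abstract's description concrete, the sketch below shows one plausible way a model of this family could combine sequence-derived protein features with propagation over a protein-protein interaction (PPI) network: an embedding of InterPro-style binary features, a small stack of graph-convolution-like layers over a normalized PPI adjacency matrix, and a multi-label sigmoid head over GO terms. This is a minimal sketch under stated assumptions, not the authors' released implementation; the class and function names (GraphGOSketch, normalize_adjacency), layer sizes and update rule are all illustrative, and PyTorch is assumed only because it is a common choice for GNN models.

```python
# Hypothetical, minimal DeepGraphGO-style sketch (not the published implementation):
# sequence features -> dense embedding -> propagation over the PPI graph -> GO-term scores.
import torch
import torch.nn as nn


def normalize_adjacency(adj: torch.Tensor) -> torch.Tensor:
    """Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    adj = adj + torch.eye(adj.size(0))
    d_inv_sqrt = adj.sum(dim=1).pow(-0.5)
    return adj * d_inv_sqrt.unsqueeze(1) * d_inv_sqrt.unsqueeze(0)


class GraphGOSketch(nn.Module):
    """Illustrative model combining sequence features with PPI-graph propagation."""

    def __init__(self, num_features: int, num_go_terms: int, hidden: int = 256, num_layers: int = 2):
        super().__init__()
        # Embed binary, InterPro-style sequence features into a dense representation.
        self.embed = nn.Sequential(nn.Linear(num_features, hidden), nn.ReLU(), nn.Dropout(0.5))
        # A small stack of graph-convolution-like layers over the PPI network.
        self.gnn_layers = nn.ModuleList(nn.Linear(hidden, hidden) for _ in range(num_layers))
        # AFP is multi-label: one independent sigmoid output per GO term.
        self.out = nn.Linear(hidden, num_go_terms)

    def forward(self, features: torch.Tensor, norm_adj: torch.Tensor) -> torch.Tensor:
        h = self.embed(features)                 # per-protein sequence representation
        for layer in self.gnn_layers:
            h = torch.relu(layer(norm_adj @ h))  # propagate over the normalized PPI graph
        return torch.sigmoid(self.out(h))        # per-protein, per-GO-term scores in [0, 1]


if __name__ == "__main__":
    # Toy example: 5 proteins, 100 sequence features, 20 GO terms.
    feats = (torch.rand(5, 100) > 0.9).float()
    ppi = (torch.rand(5, 5) > 0.7).float()
    ppi = ((ppi + ppi.t()) > 0).float()          # symmetrize: PPI edges are undirected
    model = GraphGOSketch(num_features=100, num_go_terms=20)
    scores = model(feats, normalize_adjacency(ppi))
    print(scores.shape)                          # torch.Size([5, 20])
```

Because each protein is just a node with sequence-derived features on a PPI graph, a single model of this kind can in principle be trained jointly on proteins from all species, which is the multispecies strategy the abstract refers to; a binary cross-entropy loss over GO terms would be the natural training objective for such a multi-label setup.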
Pages: I262 - I271 (10 pages)
Related papers (50 items)
  • [1] GRAPH NEURAL NETWORK FOR LARGE-SCALE NETWORK LOCALIZATION
    Yan, Wenzhong
    Jin, Di
    Lin, Zhidi
    Yin, Feng
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 5250 - 5254
  • [2] Survey on Large-scale Graph Neural Network Systems
    Zhao, Gang
    Wang, Qian-Ge
    Yao, Feng
    Zhang, Yan-Feng
    Yu, Ge
[J]. Ruan Jian Xue Bao/Journal of Software, 2022, 33 (01): 150 - 170
  • [3] Usage of a Graph Neural Network for Large-Scale Network Performance Evaluation
    Wang, Cen
    Yoshikane, Noboru
    Tsuritani, Takehiro
    [J]. 2021 INTERNATIONAL CONFERENCE ON OPTICAL NETWORK DESIGN AND MODELLING (ONDM), 2021,
  • [4] XGCN: a library for large-scale graph neural network recommendations
    Song, Xiran
    Huang, Hong
    Lian, Jianxun
    Jin, Hai
    [J]. FRONTIERS OF COMPUTER SCIENCE, 2024, 18 (03)
  • [5] XGCN: a library for large-scale graph neural network recommendations
    Xiran Song
    Hong Huang
    Jianxun Lian
    Hai Jin
    [J]. Frontiers of Computer Science, 2024, 18
  • [6] NetGO: improving large-scale protein function prediction with massive network information
    You, Ronghui
    Yao, Shuwei
    Xiong, Yi
    Huang, Xiaodi
    Sun, Fengzhu
    Mamitsuka, Hiroshi
    Zhu, Shanfeng
    [J]. NUCLEIC ACIDS RESEARCH, 2019, 47 (W1) : W379 - W387
  • [7] Prediction of New Media Information Dissemination Speed and Scale Effect Based on Large-Scale Graph Neural Network
    Chen, Qiumeng
    Shen, Yu
    [J]. SCIENTIFIC PROGRAMMING, 2022, 2022
  • [8] A large-scale evaluation of computational protein function prediction
    Radivojac P.
    Clark W.T.
    Oron T.R.
    Schnoes A.M.
    Wittkop T.
    Sokolov A.
    Graim K.
    Funk C.
    Verspoor K.
    Ben-Hur A.
    Pandey G.
    Yunes J.M.
    Talwalkar A.S.
    Repo S.
    Souza M.L.
    Piovesan D.
    Casadio R.
    Wang Z.
    Cheng J.
    Fang H.
    Gough J.
    Koskinen P.
    Törönen P.
    Nokso-Koivisto J.
    Holm L.
    Cozzetto D.
    Buchan D.W.A.
    Bryson K.
    Jones D.T.
    Limaye B.
    Inamdar H.
    Datta A.
    Manjari S.K.
    Joshi R.
    Chitale M.
    Kihara D.
    Lisewski A.M.
    Erdin S.
    Venner E.
    Lichtarge O.
    Rentzsch R.
    Yang H.
    Romero A.E.
    Bhat P.
    Paccanaro A.
    Hamp T.
    Kaßner R.
    Seemayer S.
    Vicedo E.
    Schaefer C.
    [J]. Nature Methods, 2013, 10 (3) : 221 - 227
  • [9] A large-scale evaluation of computational protein function prediction
    Radivojac, Predrag
    Clark, Wyatt T.
    Oron, Tal Ronnen
    Schnoes, Alexandra M.
    Wittkop, Tobias
    Sokolov, Artem
    Graim, Kiley
    Funk, Christopher
    Verspoor, Karin
    Ben-Hur, Asa
    Pandey, Gaurav
    Yunes, Jeffrey M.
    Talwalkar, Ameet S.
    Repo, Susanna
    Souza, Michael L.
    Piovesan, Damiano
    Casadio, Rita
    Wang, Zheng
    Cheng, Jianlin
    Fang, Hai
Gough, Julian
    Koskinen, Patrik
    Toronen, Petri
    Nokso-Koivisto, Jussi
    Holm, Liisa
    Cozzetto, Domenico
    Buchan, Daniel W. A.
    Bryson, Kevin
    Jones, David T.
    Limaye, Bhakti
    Inamdar, Harshal
    Datta, Avik
    Manjari, Sunitha K.
    Joshi, Rajendra
    Chitale, Meghana
    Kihara, Daisuke
    Lisewski, Andreas M.
    Erdin, Serkan
    Venner, Eric
    Lichtarge, Olivier
    Rentzsch, Robert
    Yang, Haixuan
    Romero, Alfonso E.
    Bhat, Prajwal
    Paccanaro, Alberto
    Hamp, Tobias
    Kassner, Rebecca
    Seemayer, Stefan
    Vicedo, Esmeralda
    Schaefer, Christian
    [J]. NATURE METHODS, 2013, 10 (03) : 221 - 227
  • [10] TIGER: Training Inductive Graph Neural Network for Large-scale Knowledge Graph Reasoning
    Wang, Kai
    Xu, Yuwei
    Luo, Siqiang
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2024, 17 (10): : 2459 - 2472