CLEP: a hybrid data- and knowledge-driven framework for generating patient representations

Cited by: 6
Authors
Bharadhwaj, Vinay Srinivas [1, 2]
Ali, Mehdi [3, 4, 5]
Birkenbihl, Colin [1, 2]
Mubeen, Sarah [1, 2, 6]
Lehmann, Jens [3, 4, 5]
Hofmann-Apitius, Martin [1, 2]
Hoyt, Charles Tapley [1, 3, 6]
Domingo-Fernandez, Daniel [1, 3, 6]
Affiliations
[1] Fraunhofer Inst Algorithms & Sci Comp, Dept Bioinformat, D-53757 St Augustin, Germany
[2] Univ Bonn, Bonn Aachen Int Ctr Informat Technol B IT, D-53115 Bonn, Germany
[3] Rheinische Friedrich Wilhelms Univ Bonn, D-53113 Bonn, Germany
[4] Fraunhofer Inst Intelligent Anal & Informat Syst, Dresden, Germany
[5] Fraunhofer Inst Intelligent Anal & Informat Syst, St Augustin, Germany
[6] Fraunhofer Ctr Machine Learning, Bonn, Germany
Keywords
BETA;
DOI
10.1093/bioinformatics/btab340
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Summary: As machine learning and artificial intelligence find a growing number of applications in the biomedical domain, their utility depends, at its core, on the data used to train them. Due to the complexity and high dimensionality of biomedical data, there is a need for approaches that combine prior knowledge of known biological interactions with patient data. Here, we present CLinical Embedding of Patients (CLEP), a novel approach that generates new patient representations by leveraging both prior knowledge and patient-level data. First, given a patient-level dataset and a knowledge graph containing relations across features that can be mapped to the dataset, CLEP incorporates patients into the knowledge graph as new nodes connected to their most characteristic features. Next, CLEP employs knowledge graph embedding models to generate new patient representations that can ultimately be used for a variety of downstream tasks, ranging from clustering to classification. We demonstrate that using the new patient representations generated by CLEP significantly improves performance in classifying patients versus healthy controls for a variety of machine learning models, compared with using the original transcriptomics data. Furthermore, we show how incorporating patients into a knowledge graph can foster the interpretation and identification of biological features characteristic of a specific disease or patient subgroup. Finally, we have released CLEP as an open source Python package together with examples and documentation.
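The first step described in the abstract, adding patients to a knowledge graph as nodes linked to their most characteristic features, can be illustrated with a minimal sketch. This is not the CLEP package's API. The function name patient_edges, the relation labels has_high and has_low, the z-score rule against healthy controls, and the toy gene names are all assumptions made for illustration; the abstract does not specify how "most characteristic" is determined.

```python
from statistics import mean, stdev


def patient_edges(patients, controls, threshold=2.0):
    """Build (patient, relation, feature) triples.

    Hypothetical rule (an assumption, not taken from the paper): a feature
    is "characteristic" of a patient if its value lies at least `threshold`
    control standard deviations away from the control mean.
    """
    features = list(next(iter(controls.values())).keys())
    # Per-feature mean and standard deviation over the healthy controls.
    stats = {}
    for feature in features:
        values = [sample[feature] for sample in controls.values()]
        stats[feature] = (mean(values), stdev(values))

    triples = []
    for patient_id, sample in patients.items():
        for feature, (mu, sd) in stats.items():
            if sd == 0:
                continue  # no variation in controls, z-score undefined
            z = (sample[feature] - mu) / sd
            if z >= threshold:
                triples.append((patient_id, "has_high", feature))
            elif z <= -threshold:
                triples.append((patient_id, "has_low", feature))
    return triples


# Toy prior-knowledge graph and expression data (illustrative values only).
kg_triples = [("GENE_A", "interacts_with", "GENE_B")]
controls = {
    "c1": {"GENE_A": 1.0, "GENE_B": 5.0},
    "c2": {"GENE_A": 1.2, "GENE_B": 5.1},
    "c3": {"GENE_A": 0.8, "GENE_B": 4.9},
}
patients = {
    "p1": {"GENE_A": 3.0, "GENE_B": 5.0},
    "p2": {"GENE_A": 1.0, "GENE_B": 4.0},
}

# The combined triple list is the kind of input a knowledge graph
# embedding model would be trained on in the second step.
combined = kg_triples + patient_edges(patients, controls)
for triple in combined:
    print(triple)
```

The second step, training a knowledge graph embedding model on the combined triples and reading off the vectors of the patient nodes, is left out here because it depends on the embedding library used.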
Pages: 3311-3318
Number of pages: 8