Impute Gene Expression Missing Values via Biological Networks: Optimal Fusion of Data and Knowledge

Cited by: 2
Authors
Xiang, Mingrong [1]
Hou, Jingyu [1]
Luo, Wei [1]
Tao, Wenjing [2]
Wang, Deshou [2]
Affiliations
[1] Deakin Univ, Sch Informat Technol, Melbourne, Vic, Australia
[2] Southwest Univ, Key Lab Freshwater Fish Reprod & Dev, Key Lab Aquat Sci Chongqing, Sch Life Sci,Minist Educ, Chongqing 400715, Peoples R China
Source
2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2021
Keywords
Missing Data Imputation; Biological Network; Gene Expression Data; Graph Neural Network; Chained Equations; Imputation
DOI
10.1109/IJCNN52387.2021.9533355
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Gene expression data often contain missing values that, if not handled properly, may mislead or invalidate the downstream analyses. With the emergence of graph neural networks (GNN), domain knowledge about gene regulation can be leveraged to guide the missing data imputation. We show in this paper, however, that naive application of GNN on the raw gene-expression data can actually lead to worse imputation. We analyse this problem considering both the intrinsic property of GNN message passing and potential data-knowledge inconsistency. We propose two measures towards optimal integration of biological networks in the gene-expression missing data imputation. These include expression data normalisation and a weighting scheme for GNN message passing. Experiments on two different biological networks and gene expression datasets show that our method outperforms state-of-the-art generic imputation algorithms and alternative GNN models, obtaining lower mean absolute error (MAE) consistently.
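The two measures named in the abstract (expression normalisation and weighted message passing) and the MAE metric can be illustrated with a minimal sketch. This is not the authors' GNN model: it replaces learned message passing with a single weighted average over network neighbours, and the function names, the per-gene z-score, and the edge-weight matrix `A` are all assumptions made for illustration.

```python
import numpy as np


def zscore_per_gene(X):
    """Standardise each gene (column) using its observed entries only."""
    mean = np.nanmean(X, axis=0)
    std = np.nanstd(X, axis=0)
    std = np.where(std > 0, std, 1.0)  # avoid dividing by zero for constant genes
    return (X - mean) / std, mean, std


def impute_with_network(X, A):
    """Fill NaNs with a weighted average of neighbouring genes' normalised values.

    X: samples x genes array, NaN marks a missing value.
    A: genes x genes array of non-negative edge weights (hypothetical input;
       A[g, j] is the weight of gene j's message to gene g).
    """
    Z, mean, std = zscore_per_gene(X)
    observed = ~np.isnan(Z)
    Z0 = np.where(observed, Z, 0.0)

    # num[s, g] = sum_j A[g, j] * Z0[s, j], counting observed neighbours only
    num = Z0 @ A.T
    den = observed.astype(float) @ A.T

    # With no observed neighbour, fall back to the gene mean (0 in z-space)
    Z_hat = np.divide(num, den, out=np.zeros_like(num), where=den > 0)

    X_hat = Z_hat * std + mean  # undo the normalisation
    return np.where(np.isnan(X), X_hat, X)


def mae(X_true, X_imputed, mask):
    """Mean absolute error over the entries selected by the boolean mask."""
    return float(np.mean(np.abs(X_true[mask] - X_imputed[mask])))


if __name__ == "__main__":
    X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, np.nan]])
    A = np.array([[0.0, 1.0], [1.0, 0.0]])  # two genes linked by one edge
    print(impute_with_network(X, A))
```

Normalising before aggregation keeps a neighbour with a large raw expression scale from dominating the average, which is one plausible reading of why the abstract reports that unnormalised inputs can hurt imputation.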
Pages: 8