Handling Missing Data with Graph Representation Learning

Cited by: 0
Authors
You, Jiaxuan [1]
Ma, Xiaobai [2]
Ding, Daisy Yi [3]
Kochenderfer, Mykel [2]
Leskovec, Jure [1]
Affiliations
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
[2] Stanford Univ, Dept Aeronaut & Astronaut, Stanford, CA USA
[3] Stanford Univ, Dept Biomed Data Sci, Stanford, CA USA
Keywords
MULTIPLE IMPUTATION; CLASSIFICATION;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Machine learning with missing data has been approached in two different ways, including feature imputation where missing feature values are estimated based on observed values and label prediction where downstream labels are learned directly from incomplete data. However, existing imputation models tend to have strong prior assumptions and cannot learn from downstream tasks, while models targeting label prediction often involve heuristics and can encounter scalability issues. Here we propose GRAPE, a graph-based framework for feature imputation as well as label prediction. GRAPE tackles the missing data problem using a graph representation, where the observations and features are viewed as two types of nodes in a bipartite graph, and the observed feature values as edges. Under the GRAPE framework, the feature imputation is formulated as an edge-level prediction task and the label prediction as a node-level prediction task. These tasks are then solved with Graph Neural Networks. Experimental results on nine benchmark datasets show that GRAPE yields 20% lower mean absolute error for imputation tasks and 10% lower for label prediction tasks, compared with existing state-of-the-art methods.
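The graph representation described in the abstract can be illustrated with a minimal sketch. It only shows how a data matrix with missing entries might be turned into a bipartite edge list; it is not the authors' implementation and contains no Graph Neural Network. The function name `matrix_to_bipartite`, the node numbering, and the toy matrix are all assumptions made for illustration.

```python
import numpy as np


def matrix_to_bipartite(X):
    """Turn an (n_obs, n_feat) matrix with NaNs into a bipartite edge list.

    Assumed numbering (hypothetical, chosen for this sketch):
    observation nodes are 0 .. n_obs-1, feature nodes are
    n_obs .. n_obs+n_feat-1.
    """
    X = np.asarray(X, dtype=float)
    n_obs, n_feat = X.shape
    observed = ~np.isnan(X)

    # Each observed entry X[i, j] becomes an edge (observation i, feature j)
    # whose attribute is the observed value.
    rows, cols = np.nonzero(observed)
    edge_index = np.stack([rows, cols + n_obs])
    edge_attr = X[rows, cols]

    # Each missing entry is a candidate edge whose value would be predicted
    # (feature imputation viewed as edge-level prediction).
    miss_rows, miss_cols = np.nonzero(~observed)
    query_edges = np.stack([miss_rows, miss_cols + n_obs])
    return edge_index, edge_attr, query_edges


# Toy data: 2 observations, 3 features, 2 missing values.
X = np.array([[0.3, np.nan, 0.1],
              [np.nan, 0.5, 0.7]])
edge_index, edge_attr, query_edges = matrix_to_bipartite(X)
print(edge_index)   # observed (observation, feature) node pairs
print(edge_attr)    # observed feature values carried on those edges
print(query_edges)  # node pairs whose edge value is missing
```

In this view, label prediction would correspond to predicting a target for each observation node, while imputation would correspond to predicting a value for each pair in `query_edges`.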
Pages: 13