Handling Missing Data with Graph Representation Learning

Cited by: 0
Authors
You, Jiaxuan [1]
Ma, Xiaobai [2]
Ding, Daisy Yi [3]
Kochenderfer, Mykel [2]
Leskovec, Jure [1]
Affiliations
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
[2] Stanford Univ, Dept Aeronaut & Astronaut, Stanford, CA USA
[3] Stanford Univ, Dept Biomed Data Sci, Stanford, CA USA
Keywords
MULTIPLE IMPUTATION; CLASSIFICATION;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Machine learning with missing data has been approached in two different ways: feature imputation, where missing feature values are estimated from observed values, and label prediction, where downstream labels are learned directly from incomplete data. However, existing imputation models tend to have strong prior assumptions and cannot learn from downstream tasks, while models targeting label prediction often involve heuristics and can encounter scalability issues. Here we propose GRAPE, a graph-based framework for both feature imputation and label prediction. GRAPE tackles the missing data problem using a graph representation, where the observations and features are viewed as two types of nodes in a bipartite graph, and the observed feature values as edges. Under the GRAPE framework, feature imputation is formulated as an edge-level prediction task and label prediction as a node-level prediction task. These tasks are then solved with Graph Neural Networks. Experimental results on nine benchmark datasets show that GRAPE yields 20% lower mean absolute error for imputation tasks and 10% lower for label prediction tasks, compared with existing state-of-the-art methods.
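The bipartite representation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation; it only shows, under stated assumptions, how a data matrix with missing entries might be turned into an edge list. The function names `to_bipartite_edges` and `missing_pairs`, the use of NaN to mark missing values, and the node numbering (observation nodes first, then feature nodes) are all assumptions made for illustration.

```python
import numpy as np

def to_bipartite_edges(X):
    """Convert a data matrix with NaNs into a bipartite edge list.

    Hypothetical helper: observation nodes are numbered 0..n_obs-1 and
    feature nodes n_obs..n_obs+n_feat-1 (an assumed convention).
    """
    X = np.asarray(X, dtype=float)
    n_obs, _ = X.shape
    # Each observed (non-NaN) entry becomes one edge.
    obs_idx, feat_idx = np.nonzero(~np.isnan(X))
    edge_index = np.stack([obs_idx, feat_idx + n_obs])
    # The observed feature value is carried as the edge attribute.
    edge_attr = X[obs_idx, feat_idx]
    return edge_index, edge_attr

def missing_pairs(X):
    """List (observation, feature) pairs whose value is missing.

    In an edge-level formulation these are the edges to be predicted.
    """
    X = np.asarray(X, dtype=float)
    rows, cols = np.nonzero(np.isnan(X))
    return list(zip(rows.tolist(), cols.tolist()))

# Two observations, three features, two missing entries.
X = np.array([[0.3, np.nan, 0.1],
              [np.nan, 0.5, 0.7]])
edge_index, edge_attr = to_bipartite_edges(X)
to_impute = missing_pairs(X)
```

Here `edge_index` holds one column per observed value, and `to_impute` lists the observation-feature pairs that an edge-level predictor would be asked to fill in. A graph neural network would then operate on this edge list; that part is omitted from the sketch.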
Pages: 13