GMMDA: Gaussian Mixture Modeling of Graph in Latent Space for Graph Data Augmentation

Cited by: 0

Authors:

Li, Yanjin [1]

Xu, Linchuan [1,2]

Yamanishi, Kenji [1]

Affiliations:

[1] Univ Tokyo, Grad Sch Informat Sci & Technol, Dept Math Informat, Tokyo, Japan

[2] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China

Source:

23rd IEEE International Conference on Data Mining, ICDM 2023 | 2023

Keywords:

graph data augmentation; graph neural networks; Gaussian mixture model; semi-supervised learning; minimum description length principle; NETWORKS;

DOI:

10.1109/ICDM58522.2023.00041

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Graph data augmentation (GDA), which manipulates graph structure and/or attributes, has been demonstrated as an effective method for improving the generalization of graph neural networks on semi-supervised node classification. As a data augmentation technique, label preservation is critical; that is, node labels should not change after data manipulation. However, most existing methods overlook the label-preservation requirement. Determining the label-preserving nature of a GDA method is highly challenging, owing to the non-Euclidean nature of the graph structure. In this study, for the first time, we formulate a label-preserving problem (LPP) in the context of GDA. The LPP is formulated as an optimization problem in which, given a fixed augmentation budget, the objective is to find an augmented graph with minimal difference in data distribution compared to the original graph. To solve the LPP, we propose GMMDA, a generative data augmentation (DA) method based on Gaussian mixture modeling (GMM) of a graph in a latent space. The proposed GMMDA has three phases. First, a novel objective is designed to jointly learn a low-dimensional graph representation and estimate the GMM. The learning is followed by sampling from the GMM, and then the samples are converted back to the graph as additional nodes. To uphold label preservation, we designed a minimum description length (MDL)-based method to select a set of samples that produces the minimum shift in the data distribution. Through experiments, we demonstrate that GMMDA can improve the performance of a graph convolutional network on CORA, CITESEER, and PUBMED by as much as 7.75%, 8.75%, and 5.87%, respectively, significantly outperforming the state-of-the-art methods.
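The sample-then-select idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the mixture parameters are hand-picked instead of learned jointly with a graph representation, the covariances are assumed diagonal, and the MDL-based selection is replaced by a simple stand-in that keeps the candidates with the highest mixture log-density. All function names (`sample_gmm`, `log_density`, `select_samples`) are hypothetical.

```python
import math
import random


def sample_gmm(weights, means, stds, n, rng):
    """Draw n points from a diagonal-covariance Gaussian mixture.

    Returns (component_index, point) pairs; keeping the component index
    is an assumption made here for illustration only.
    """
    samples = []
    for _ in range(n):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        point = [rng.gauss(m, s) for m, s in zip(means[k], stds[k])]
        samples.append((k, point))
    return samples


def log_density(point, weights, means, stds):
    """Log-density of a point under the mixture, via log-sum-exp."""
    logs = []
    for w, mu, sd in zip(weights, means, stds):
        lp = math.log(w)
        for x, m, s in zip(point, mu, sd):
            lp += -0.5 * math.log(2 * math.pi * s * s) - (x - m) ** 2 / (2 * s * s)
        logs.append(lp)
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))


def select_samples(samples, weights, means, stds, budget):
    """Keep the `budget` candidates with the highest mixture log-density.

    This ranking is a stand-in for the MDL-based selection described in
    the abstract, not a reproduction of it.
    """
    ranked = sorted(
        samples,
        key=lambda s: log_density(s[1], weights, means, stds),
        reverse=True,
    )
    return ranked[:budget]


# Toy two-component mixture in a 2-D latent space (values are made up).
rng = random.Random(0)
weights = [0.5, 0.5]
means = [[0.0, 0.0], [5.0, 5.0]]
stds = [[1.0, 1.0], [1.0, 1.0]]

candidates = sample_gmm(weights, means, stds, 20, rng)
kept = select_samples(candidates, weights, means, stds, budget=5)
```

Here `budget` plays the role of the fixed augmentation budget: only that many sampled latent points would go on to be converted back into additional graph nodes, a step this sketch leaves out.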


Pages: 319-328

Page count: 10
