GMMDA: Gaussian Mixture Modeling of Graph in Latent Space for Graph Data Augmentation

Cited by: 0
Authors
Li, Yanjin [1]
Xu, Linchuan [1, 2]
Yamanishi, Kenji [1]
Affiliations
[1] Univ Tokyo, Grad Sch Informat Sci & Technol, Dept Math Informat, Tokyo, Japan
[2] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
Keywords
graph data augmentation; graph neural networks; Gaussian mixture model; semi-supervised learning; minimum description length principle; NETWORKS;
DOI
10.1109/ICDM58522.2023.00041
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Graph data augmentation (GDA), which manipulates graph structure and/or attributes, has been demonstrated as an effective method for improving the generalization of graph neural networks on semi-supervised node classification. For a data augmentation technique, label preservation is critical: node labels should not change after the data are manipulated. However, most existing methods overlook this requirement. Determining whether a GDA method is label-preserving is highly challenging, owing to the non-Euclidean nature of the graph structure. In this study, for the first time, we formulate a label-preserving problem (LPP) in the context of GDA. The LPP is formulated as an optimization problem in which, given a fixed augmentation budget, the objective is to find an augmented graph with minimal difference in data distribution compared to the original graph. To solve the LPP, we propose GMMDA, a generative data augmentation (DA) method based on Gaussian mixture modeling (GMM) of a graph in a latent space. The proposed GMMDA has three phases. First, a novel objective is designed to jointly learn a low-dimensional graph representation and estimate the GMM. Second, samples are drawn from the GMM and converted back to the graph as additional nodes. Third, to uphold label preservation, a minimum description length (MDL)-based method selects the set of samples that produces the minimum shift in the data distribution. Through experiments, we demonstrate that GMMDA can improve the performance of a graph convolutional network on CORA, CITESEER, and PUBMED by as much as 7.75%, 8.75%, and 5.87%, respectively, significantly outperforming the state-of-the-art methods.
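The sample-then-select idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the latent vectors are already given (the paper learns them jointly with the GMM), fits one diagonal-covariance Gaussian per class in closed form instead of optimizing the paper's objective, and stands in for the MDL-based selection with a simple proxy that keeps the candidates with the shortest code length, -log2 p(z), under the fitted mixture. All function names and the toy data are hypothetical.

```python
import math
import random


def fit_components(latents, labels):
    """Fit one diagonal Gaussian per class: {label: (weight, mean, var)}."""
    groups = {}
    for z, y in zip(latents, labels):
        groups.setdefault(y, []).append(z)
    comps = {}
    for y, zs in groups.items():
        dim = len(zs[0])
        mean = [sum(z[d] for z in zs) / len(zs) for d in range(dim)]
        # Small constant keeps variances strictly positive.
        var = [sum((z[d] - mean[d]) ** 2 for z in zs) / len(zs) + 1e-6
               for d in range(dim)]
        comps[y] = (len(zs) / len(latents), mean, var)
    return comps


def log_density(z, comps):
    """Natural-log mixture density of z, computed with log-sum-exp."""
    terms = []
    for w, mean, var in comps.values():
        lp = math.log(w)
        for x, m, v in zip(z, mean, var):
            lp += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        terms.append(lp)
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def sample_candidates(comps, n, rng):
    """Draw n (latent, label) pairs; the label is the sampled component."""
    labels = list(comps)
    weights = [comps[y][0] for y in labels]
    out = []
    for _ in range(n):
        y = rng.choices(labels, weights=weights)[0]
        _, mean, var = comps[y]
        out.append(([rng.gauss(m, math.sqrt(v)) for m, v in zip(mean, var)], y))
    return out


def select_by_code_length(cands, comps, budget):
    """Proxy for MDL selection: keep the `budget` shortest code lengths.

    Code length is -log2 p(z), defined up to a quantization constant
    because p is a continuous density.
    """
    ranked = sorted(cands, key=lambda c: -log_density(c[0], comps) / math.log(2))
    return ranked[:budget]


# Toy latent space: two well-separated classes in 2-D (assumed data).
rng = random.Random(0)
latents = ([[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
           + [[rng.gauss(5, 1), rng.gauss(5, 1)] for _ in range(50)])
labels = [0] * 50 + [1] * 50

comps = fit_components(latents, labels)
cands = sample_candidates(comps, 200, rng)
chosen = select_by_code_length(cands, comps, budget=20)
```

Here `budget` plays the role of the fixed augmentation budget in the LPP. Ranking candidates individually is a simplification: the abstract describes selecting a set of samples that jointly minimizes the distribution shift, which a per-sample score does not capture.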
Pages: 319-328
Page count: 10
Related papers
  • [1] GMMDA: Gaussian mixture modeling of graph in latent space for graph data augmentation
    Li, Yanjin
    Xu, Linchuan
    Yamanishi, Kenji
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2024,
  • [2] Multichannel Adaptive Data Mixture Augmentation for Graph Neural Networks
    Ye, Zhonglin
    Zhou, Lin
    Li, Mingyuan
    Zhang, Wei
    Liu, Zhen
    Zhao, Haixing
    [J]. INTERNATIONAL JOURNAL OF DATA WAREHOUSING AND MINING, 2024, 20 (01)
  • [3] Graph Contrastive Learning with Constrained Graph Data Augmentation
    Xu, Shaowu
    Wang, Luo
    Jia, Xibin
    [J]. NEURAL PROCESSING LETTERS, 2023, 55 (08) : 10705 - 10726
  • [4] Graph Contrastive Learning with Constrained Graph Data Augmentation
    Shaowu Xu
    Luo Wang
    Xibin Jia
    [J]. Neural Processing Letters, 2023, 55 : 10705 - 10726
  • [5] Data Augmentation for Graph Classification
    Zhou, Jiajun
    Shen, Jie
    Xuan, Qi
    [J]. CIKM '20: PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, 2020, : 2341 - 2344
  • [6] Unsupervised Graph Structure Learning Based on Optimal Graph Topology Modeling and Adaptive Data Augmentation
    An, Dongdong
    Pan, Zongxu
    Zhao, Qin
    Liu, Wenyan
    Liu, Jing
    [J]. MATHEMATICS, 2024, 12 (13)
  • [7] G-Mixup: Graph Data Augmentation for Graph Classification
    Han, Xiaotian
    Jiang, Zhimeng
    Liu, Ninghao
    Hu, Xia
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [8] Data Augmentation for Graph Neural Networks
    Zhao, Tong
    Liu, Yozen
    Neves, Leonardo
    Woodford, Oliver
    Jiang, Meng
    Shah, Neil
    [J]. THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 11015 - 11023
  • [9] Graph Data Augmentation for Node Classification
    Wei, Ziyu
    Xiao, Xi
    Zhang, Bin
    Hu, Guangwu
    Li, Qing
    Xia, Shutao
    [J]. 2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 4899 - 4905
  • [10] GRAPH-BASED METHOD BASED ON GAUSSIAN MIXTURE MODELING TO CLASSIFY AGRICULTURAL LANDS
    Ok, Ali Ozgun
    Ok, Asli Ozdarici
    Schindler, Konrad
    [J]. 2014 22ND SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2014, : 425 - 428