Semi-Supervised Heterogeneous Graph Learning with Multi-Level Data Augmentation

Cited by: 0
Authors
Chen, Ying [1]
Qiang, Siwei [1]
Ha, Mingming [1, 2]
Liu, Xiaolei [1]
Li, Shaoshuai [1]
Tong, Jiabi [1]
Yuan, Lingfeng [1]
Guo, Xiaobo [1, 3]
Zhu, Zhenfeng [3]
Affiliations
[1] Ant Grp, MYbank, Hangzhou, Zhejiang, Peoples R China
[2] Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing, Peoples R China
[3] Beijing Jiaotong Univ, Inst Informat Sci, Beijing, Peoples R China
Keywords
Semi-supervised learning; node augmentation; triangle augmentation;
DOI
10.1145/3608953
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In recent years, semi-supervised graph learning with data augmentation (DA) has been the most commonly used and best-performing method for improving model robustness in sparse scenarios with few labeled samples. However, most existing DA methods are designed for homogeneous graphs, and none specifically targets heterogeneous graphs. Compared with the homogeneous case, DA on a heterogeneous graph faces greater challenges. First, the heterogeneity of information requires DA strategies to handle heterogeneous relations effectively, taking into account how much information each type of neighbor and edge contributes to the target nodes. Second, over-squashing of information is caused by the negative curvature that arises from the non-uniform distribution and strong clustering in a complex graph. To address these challenges, this article presents a novel method named HG-MDA (Semi-Supervised Heterogeneous Graph Learning with Multi-Level Data Augmentation). For the heterogeneity problem, node and topology augmentation strategies are proposed that are tailored to the characteristics of the heterogeneous graph, and meta-relation-based attention is applied as one of the indexes for selecting augmented nodes and edges. For the over-squashing problem, triangle-based edge adding and removing are designed to alleviate the negative curvature and bring topological gains. Finally, the loss function consists of the cross-entropy loss for labeled data and a consistency regularization term for unlabeled data; sharpening is used to effectively fuse the prediction results of the various DA strategies. Extensive experiments on public datasets (i.e., ACM, DBLP, and OGB) and the industry dataset MB show that HG-MDA outperforms current SOTA models. Additionally, HG-MDA is applied to user identification in internet finance scenarios, helping the business to add 30% key users and increase loans and balances by 3.6%, 11.1%, and 9.8%.
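The abstract names the ingredients of the loss (cross-entropy on labeled nodes, consistency regularization on unlabeled nodes, sharpening to fuse predictions from several augmentations) but gives no formulas. The following is a minimal sketch of how such a loss is commonly assembled, not the paper's actual formulation. The function names, the temperature `temperature`, the weight `lam`, and the squared-error distance are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores to a probability vector (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sharpen(probs, temperature=0.5):
    """Raise each probability to 1/T and renormalize; T < 1 lowers entropy."""
    powered = [p ** (1.0 / temperature) for p in probs]
    total = sum(powered)
    return [p / total for p in powered]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class for one labeled node."""
    return -math.log(probs[label] + 1e-12)


def consistency_loss(view_probs, temperature=0.5):
    """Mean squared distance between each augmented view's prediction and
    the sharpened average prediction (assumed distance; hypothetical)."""
    n_views = len(view_probs)
    n_classes = len(view_probs[0])
    mean = [sum(p[c] for p in view_probs) / n_views for c in range(n_classes)]
    target = sharpen(mean, temperature)
    return sum(
        sum((p[c] - target[c]) ** 2 for c in range(n_classes))
        for p in view_probs
    ) / n_views


def total_loss(labeled, unlabeled, lam=1.0, temperature=0.5):
    """labeled: list of (logits, label); unlabeled: list of per-node lists
    of logits, one entry per augmentation. lam is an assumed weight."""
    sup = sum(cross_entropy(softmax(lg), y) for lg, y in labeled)
    sup /= max(len(labeled), 1)
    unsup = sum(
        consistency_loss([softmax(v) for v in views], temperature)
        for views in unlabeled
    )
    unsup /= max(len(unlabeled), 1)
    return sup + lam * unsup
```

In this sketch, sharpening pushes the averaged prediction toward its dominant class, so the consistency term rewards augmented views that agree on a confident label. A real implementation would operate on batched tensors in a deep learning framework rather than on Python lists.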
Pages: 27
Related papers
50 in total
  • [1] Heterogeneous graph contrastive learning with adaptive data augmentation for semi-supervised short text classification
    Wu, Mingqiang
    Xu, Zhuoming
    Zheng, Lei
    EXPERT SYSTEMS, 2025, 42 (02)
  • [2] A comparison of graph-based semi-supervised learning for data augmentation
    de Oliveira, Willian Dihanster G.
    Penatti, Otavio A. B.
    Berton, Lilian
    2020 33RD SIBGRAPI CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI 2020), 2020, : 264 - 271
  • [3] Semi-supervised semantic segmentation with multi-reliability and multi-level feature augmentation
    Yin, Jianjian
    Zheng, Zhichao
    Pan, Yulu
    Gu, Yanhui
    Chen, Yi
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 233
  • [4] Semi-Supervised Learning with Data Augmentation for Tabular Data
    Fang, Junpeng
    Tang, Caizhi
    Cui, Qing
    Zhu, Feng
    Li, Longfei
    Zhou, Jun
    Zhu, Wei
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2022, 2022, : 3928 - 3932
  • [5] Robust Semi-Supervised Learning With Multi-Consistency and Data Augmentation
    Guo, Jing-Ming
    Sun, Chi-Chia
    Chan, Kuan-Yu
    Liu, Chun-Yu
    IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2024, 70 (01) : 414 - 424
  • [6] Semi-Supervised Graph Contrastive Learning With Virtual Adversarial Augmentation
    Dong, Yixiang
    Luo, Minnan
    Li, Jundong
    Liu, Ziqi
    Zheng, Qinghua
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (08) : 4232 - 4244
  • [7] Data Augmentation for Graph Convolutional Network on Semi-supervised Classification
    Tang, Zhengzheng
    Qiao, Ziyue
    Hong, Xuehai
    Wang, Yang
    Dharejo, Fayaz Ali
    Zhou, Yuanchun
    Du, Yi
    WEB AND BIG DATA, APWEB-WAIM 2021, PT II, 2021, 12859 : 33 - 48
  • [8] Enhanced Graph Neural Network with Multi-Task Learning and Data Augmentation for Semi-Supervised Node Classification
    Fan, Cheng
    Wang, Buhong
    Wang, Zhen
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2023, 37 (12)
  • [9] MLC: Multi-level consistency learning for semi-supervised left atrium segmentation
    Shi, Zhebin
    Jiang, Mingfeng
    Li, Yang
    Wei, Bo
    Wang, Zefeng
    Wu, Yongquan
    Tan, Tao
    Yang, Guang
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 244
  • [10] Semi-supervised Multi-label Learning for Graph-structured Data
    Song, Zixing
    Meng, Ziqiao
    Zhang, Yifei
    King, Irwin
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021, : 1723 - 1733