Semi-Supervised Heterogeneous Graph Learning with Multi-Level Data Augmentation

Citations: 0
Authors
Chen, Ying [1]
Qiang, Siwei [1]
Ha, Mingming [1, 2]
Liu, Xiaolei [1]
Li, Shaoshuai [1]
Tong, Jiabi [1]
Yuan, Lingfeng [1]
Guo, Xiaobo [1, 3]
Zhu, Zhenfeng [3]
Affiliations
[1] Ant Grp, MYbank, Hangzhou, Zhejiang, Peoples R China
[2] Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing, Peoples R China
[3] Beijing Jiaotong Univ, Inst Informat Sci, Beijing, Peoples R China
Keywords
Semi-supervised learning; node augmentation; triangle augmentation;
DOI
10.1145/3608953
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In recent years, semi-supervised graph learning with data augmentation (DA) has been the most commonly used and best-performing method to improve model robustness in sparse scenarios with few labeled samples. However, most existing DA methods are designed for homogeneous graphs, and none is specific to heterogeneous graphs. Compared with the homogeneous case, DA on a heterogeneous graph faces greater challenges. First, the heterogeneity of information requires DA strategies that handle heterogeneous relations effectively, taking into account how much information different types of neighbors and edges contribute to the target nodes. Second, over-squashing of information is caused by the negative curvature that arises from the non-uniform distribution and strong clustering in a complex graph. To address these challenges, this article presents a novel method named HG-MDA (Semi-Supervised Heterogeneous Graph Learning with Multi-Level Data Augmentation). For the heterogeneity problem, node and topology augmentation strategies are proposed that suit the characteristics of the heterogeneous graph, and meta-relation-based attention is applied as one of the indexes for selecting augmented nodes and edges. For the over-squashing problem, triangle-based edge adding and removing are designed to alleviate the negative curvature and bring a topological gain. Finally, the loss function consists of the cross-entropy loss for labeled data and a consistency regularization term for unlabeled data; sharpening is used to fuse the prediction results of the various DA strategies effectively. Extensive experiments on public datasets (i.e., ACM, DBLP, and OGB) and the industry dataset MB show that HG-MDA outperforms current SOTA models. Additionally, HG-MDA is applied to user identification in internet finance scenarios, helping the business to add 30% key users, and increase loans and balances by 3.6%, 11.1%, and 9.8%.
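The loss described in the abstract (cross-entropy on labeled nodes, consistency regularization on unlabeled nodes, sharpening to fuse the predictions from several augmentations) can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption: temperature sharpening and a squared-error consistency term are common choices in semi-supervised learning and may differ from what HG-MDA actually uses. The function names (sharpen, average, consistency, semi_supervised_loss) and the parameters temperature and weight are hypothetical.

```python
import math

def sharpen(probs, temperature=0.5):
    """Temperature sharpening (assumed form): p_i^(1/T), renormalized."""
    powered = [p ** (1.0 / temperature) for p in probs]
    total = sum(powered)
    return [p / total for p in powered]

def average(distributions):
    """Element-wise mean of the class distributions predicted for one node."""
    n = len(distributions)
    return [sum(column) / n for column in zip(*distributions)]

def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[label])

def consistency(distributions, target):
    """Mean squared distance of each augmented prediction from the target
    (an assumed choice of consistency term)."""
    return sum(
        sum((p - t) ** 2 for p, t in zip(dist, target))
        for dist in distributions
    ) / len(distributions)

def semi_supervised_loss(labeled, unlabeled, temperature=0.5, weight=1.0):
    """labeled: list of (probs, label) pairs.
    unlabeled: list of nodes, each a list of probs, one per augmentation."""
    supervised = sum(cross_entropy(p, y) for p, y in labeled) / len(labeled)
    unsupervised = 0.0
    for views in unlabeled:
        # Fuse the augmented predictions, then sharpen the fused distribution.
        target = sharpen(average(views), temperature)
        unsupervised += consistency(views, target)
    unsupervised /= len(unlabeled)
    return supervised + weight * unsupervised
```

In this sketch, a temperature below 1 pushes the fused distribution toward its most likely class, so the consistency term pulls every augmented prediction toward a more confident target; a temperature of 1 leaves the averaged distribution unchanged.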
Pages: 27
Related papers
50 in total
  • [31] Graph Representation Learning Model for Multi-Level Feature Augmentation
    Feng, Yao
    Kong, Bing
    Zhou, Lihua
    Bao, Chongming
    Wang, Chongyun
    Computer Engineering and Applications, 2023, 59 (11) : 131 - 140
  • [32] SMGCL: Semi-supervised Multi-view Graph Contrastive Learning
    Zhou, Hui
    Gong, Maoguo
    Wang, Shanfeng
    Gao, Yuan
    Zhao, Zhongying
    KNOWLEDGE-BASED SYSTEMS, 2023, 260
  • [33] Multi-view semi-supervised learning with adaptive graph fusion
    Qiang, Qianyao
    Zhang, Bin
    Nie, Feiping
    Wang, Fei
    NEUROCOMPUTING, 2023, 557
  • [34] Fast Multi-View Semi-Supervised Learning With Learned Graph
    Zhang, Bin
    Qiang, Qianyao
    Wang, Fei
    Nie, Feiping
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (01) : 286 - 299
  • [35] GRAPH CONVOLUTIONAL NETWORK BASED SEMI-SUPERVISED LEARNING ON MULTI-SPEAKER MEETING DATA
    Tong, Fuchuan
    Zheng, Siqi
    Zhang, Min
    Chen, Yafeng
    Suo, Hongbin
    Hong, Qingyang
    Li, Lin
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6622 - 6626
  • [36] Flexible multi-view semi-supervised learning with unified graph
    Li, Zhongheng
    Qiang, Qianyao
    Zhang, Bin
    Wang, Fei
    Nie, Feiping
    NEURAL NETWORKS, 2021, 142 (142) : 92 - 104
  • [37] GRAPH-BASED SEMI-SUPERVISED LEARNING WITH MULTI-LABEL
    Zha, Zheng-Jun
    Mei, Tao
    Wang, Jingdong
    Wang, Zengfu
    Hua, Xian-Sheng
    2008 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-4, 2008, : 1321 - +
  • [38] Multi-Graph Based Semi-Supervised Learning for Activity Recognition
    Stikic, Maja
    Larlus, Diane
    Schiele, Bernt
    2009 INTERNATIONAL SYMPOSIUM ON WEARABLE COMPUTERS, PROCEEDINGS, 2009, : 85 - 92
  • [40] Inductive Multi-View Semi-supervised Learning with a Consensus Graph
    Ziraki, N.
    Bosaghzadeh, A.
    Dornaika, F.
    Ibrahim, Z.
    Barrena, N.
    COGNITIVE COMPUTATION, 2023, 15 (03) : 904 - 913