Multi-Relational Graph Representation Learning for Financial Statement Fraud Detection

Authors
Wang, Chenxu [1, 2]
Wang, Mengqin [1]
Wang, Xiaoguang [1]
Zhang, Luyue [1]
Long, Yi [3]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Software Engn, Xian 710049, Peoples R China
[2] Xi An Jiao Tong Univ, MoE Key Lab Intelligent Networks & Network Secur, Xian 710049, Peoples R China
[3] Chinese Univ Hong Kong Shenzhen CUHK Shenzhen, Shenzhen Finance Inst, Shenzhen, Peoples R China
Source
BIG DATA MINING AND ANALYTICS | 2024 / Vol. 7 / Issue 03
Funding
National Natural Science Foundation of China
Keywords
financial statement fraud; class imbalance; Graph Neural Networks (GNN); multi-relational graphs; CORPORATE FRAUD; MORLET WAVELET; NEURAL-NETWORK; VECTOR MACHINE; DECISION TREE; DESIGN; FOREST;
DOI
10.26599/BDMA.2024.9020013
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Financial statement fraud refers to the malicious manipulation of financial data in listed companies' annual statements. Traditional machine learning approaches focus on individual companies, overlooking the interactive relationships among companies that are crucial for identifying fraud patterns. Moreover, fraud detection is a typical imbalanced binary classification task in which normal samples outnumber fraudulent ones. In this paper, we propose a multi-relational graph convolutional network, named FraudGCN, for detecting financial statement fraud. A multi-relational graph is constructed to integrate industrial, supply chain, and accounting-sharing relationships, effectively encapsulating the multidimensional and complex interactions among companies. We then develop a multi-relational graph convolutional network to aggregate information within each relationship and employ an attention mechanism to fuse information across multiple relationships. The attention mechanism enables the model to distinguish the importance of different relationships, thereby aggregating more useful information from key relationships. To alleviate the class imbalance problem, we present a diffusion-based under-sampling strategy that strategically selects key nodes globally for model training. We also employ focal loss to assign greater weights to harder-to-classify minority samples. We build a real-world dataset from the annual financial statements of listed companies in China. The experimental results show that FraudGCN achieves an improvement of 3.15% in Macro-recall, 3.36% in Macro-F1, and 3.86% in GMean compared to the second-best method. The dataset and code are publicly available at: https://github.com/XNetLab/MRG-for-Finance.
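As a rough illustration of the focal loss mentioned in the abstract, the sketch below computes the standard binary focal loss for a single sample. It is not taken from the FraudGCN code: the function name `focal_loss` is hypothetical, and the defaults `gamma=2.0` and `alpha=0.25` are the values commonly used in the original focal loss formulation, assumed here rather than reported by this paper.

```python
import math


def focal_loss(prob_fraud, label, gamma=2.0, alpha=0.25):
    """Binary focal loss for one sample (illustrative sketch only).

    prob_fraud: predicted probability that the company is fraudulent.
    label: 1 for fraud, 0 for normal.
    gamma, alpha: assumed hyperparameters, not values from the paper.
    """
    eps = 1e-12
    # p_t is the probability the model assigns to the true class.
    p_t = prob_fraud if label == 1 else 1.0 - prob_fraud
    # alpha_t weights the two classes differently.
    alpha_t = alpha if label == 1 else 1.0 - alpha
    # Clamp to avoid log(0).
    p_t = min(max(p_t, eps), 1.0)
    # (1 - p_t) ** gamma shrinks the loss of well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct fraud prediction contributes far less loss
# than a fraud sample the model gets badly wrong.
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)
print(easy < hard)
```

With `gamma=0` the modulating factor disappears and the expression reduces to class-weighted cross-entropy, which is why larger values of `gamma` shift the training signal toward hard samples.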
Pages: 920-941
Page count: 22