On the combination of graph data for assessing thin-file borrowers' creditworthiness

Cited by: 9
Authors
Munoz-Cancino, Ricardo [1]
Bravo, Cristian [2]
Rios, Sebastian A. [1]
Grana, Manuel [3]
Affiliations
[1] Univ Chile, Business Intelligence Res Ctr CEINE, Ind Engn Dept, Beauchef 851, Santiago 8370456, Chile
[2] Univ Western Ontario, Dept Stat & Actuarial Sci, 1151 Richmond St, London, ON N6A 5B7, Canada
[3] Univ Basque Country, Computat Intelligence Grp, San Sebastian 20018, Spain
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Credit scoring; Machine learning; Social network analysis; Network data; Graph neural networks; Feature selection; Credit; Prediction; Model
DOI
10.1016/j.eswa.2022.118809
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Thin-file borrowers are customers for whom a creditworthiness assessment is uncertain due to their lack of credit history. To address missing credit information, many researchers have used borrowers' social interactions as an alternative data source. Social network data has traditionally been exploited through hand-crafted feature engineering, but lately, graph neural networks have emerged as a promising alternative. Here we introduce an information-processing framework to improve credit scoring models by blending several methods of graph representation learning: feature engineering, graph embeddings, and graph neural networks. In this approach, we aggregate the methods' outputs and feed them to a gradient boosting classifier to produce a final creditworthiness score. We have validated this framework over a unique multi-source dataset that characterizes the relationships, interactions, and credit history for the entire population of a Latin American country, applying it to both application and behavioral credit risk models. It also allows us to study both individuals and companies. Our results show that the methods of graph representation learning should be used as complements; they should not be seen as self-sufficient methods, as is currently done. We improve the creditworthiness assessment performance in terms of the Area Under the ROC Curve (AUC) and Kolmogorov-Smirnov (KS) measures, outperforming traditional methods of exploiting social interaction data. In the area of corporate lending, where the potential gain is much higher, our results confirm that the evaluation of a thin-file company cannot solely consider the company's own characteristics. The business ecosystem in which these companies interact with their owners, suppliers, customers, and other companies provides novel knowledge that enables financial institutions to enhance their creditworthiness assessment.
Our results indicate when and on which population to use graph data, and the expected effects on performance. They also show the enormous value of graph data on the credit scoring problem for thin-file borrowers, mainly to help companies with thin or no credit history to enter the financial system.
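The abstract reports performance through AUC and KS. As a reading aid, the following is a minimal pure-Python sketch of how these two measures are commonly computed for a binary scorecard; it is not the authors' evaluation code. The function names `auc_score` and `ks_statistic`, the toy labels and scores, and the convention that label 1 marks the class of interest (for example, defaulters) are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def _split_by_class(
    y_true: Sequence[int], scores: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Split scores into (positive-class scores, negative-class scores)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    return pos, neg


def auc_score(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    This O(n^2) pairwise form is for clarity, not for large datasets.
    """
    pos, neg = _split_by_class(y_true, scores)
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def ks_statistic(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """KS as the largest gap between the two classes' empirical score CDFs."""
    pos, neg = _split_by_class(y_true, scores)
    best = 0.0
    for t in sorted(set(scores)):
        cdf_pos = sum(s <= t for s in pos) / len(pos)
        cdf_neg = sum(s <= t for s in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


# Toy data (hypothetical): label 1 is the class of interest.
labels = [0, 0, 1, 1]
model_scores = [0.1, 0.4, 0.35, 0.8]
print(auc_score(labels, model_scores))     # 0.75
print(ks_statistic(labels, model_scores))  # 0.5
```

Both measures depend only on how the scores rank the two classes, so they can be compared across models whose raw score scales differ.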
Pages: 15