TLTD: Transfer Learning for Tabular Data

Cited by: 7
Authors
Bragilovski, Maxim [1]
Kapri, Zahi [1]
Rokach, Lior [1]
Levy-Tzedek, Shelly [2, 3, 4]
Affiliations
[1] Ben Gurion Univ Negev, Dept Software & Informat Syst Engn, IL-8410501 Beer Sheva, Israel
[2] Ben Gurion Univ Negev, Recanati Sch Community Hlth Profess, Dept Phys Therapy, Beer Sheva, Israel
[3] Ben Gurion Univ Negev, Zlotowski Ctr Neurosci, Beer Sheva, Israel
[4] Univ Freiburg, Freiburg Inst Adv Studies FRIAS, Freiburg, Germany
Keywords
Deep-learning; Feature-extraction; Tabular datasets
DOI
10.1016/j.asoc.2023.110748
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Deep Neural Networks (DNNs) have become effective for various machine learning tasks. DNNs are known to achieve high accuracy with unstructured data in which each data sample (e.g., image) consists of many raw features (e.g., pixels) of the same type. The effectiveness of this approach diminishes for structured (tabular) data. In most cases, decision tree-based models such as Random Forest (RF) or Gradient Boosting Decision Trees (GBDT) outperform DNNs. In addition, DNNs tend to perform poorly when the number of samples in the dataset is small. This paper introduces Transfer Learning for Tabular Data (TLTD), which utilizes a novel learning architecture designed to extract new features from structured datasets. To exploit the DNN's learning capabilities on images, we convert the tabular data into images, then use the distillation technique to achieve better learning. We evaluated our approach with 25 structured datasets, and compared the outcomes to those of RF, eXtreme Gradient Boosting (XGBoost), TabNet, KNN, and TabPFN. The results demonstrate the usefulness of the TLTD approach. © 2023 Elsevier B.V. All rights reserved.
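The abstract does not say how rows are mapped to images, so the following is only a minimal illustrative sketch of the general tabular-to-image idea, not the TLTD procedure. It assumes purely numeric features, per-feature min-max scaling, and zero-padding to the smallest square grid; the function name `rows_to_images` and all of these choices are hypothetical.

```python
import math

import numpy as np


def rows_to_images(X):
    """Map each row of a 2-D numeric array to a square grayscale image.

    Illustrative assumption only: features are min-max scaled per column
    and laid out row-major on the smallest square grid that fits them.
    """
    X = np.asarray(X, dtype=np.float64)
    n_samples, n_features = X.shape

    # Scale every feature to [0, 1]; constant columns are mapped to 0.
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0
    scaled = (X - col_min) / col_range

    # Smallest square that holds all features; unused pixels stay 0.
    side = math.ceil(math.sqrt(n_features))
    padded = np.zeros((n_samples, side * side))
    padded[:, :n_features] = scaled
    return padded.reshape(n_samples, side, side)


# Two samples with five features each become two 3x3 images.
X_demo = [[0.0, 10.0, 5.0, 1.0, 2.0],
          [1.0, 20.0, 5.0, 3.0, 4.0]]
images = rows_to_images(X_demo)
print(images.shape)
```

Images produced this way could then be fed to an image model pretrained elsewhere; how TLTD combines that step with distillation is described in the paper itself, not in this record.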
Pages: 12
Related papers
50 in total
  • [41] Graph Neural Network contextual embedding for Deep Learning on tabular data
    Villaizan-Vallelado, Mario
    Salvatori, Matteo
    Carro, Belen
    Sanchez-Esguevillas, Antonio Javier
    NEURAL NETWORKS, 2024, 173
  • [42] Updating surrogate models in early building design via tabular transfer learning
    Hinkle, Laura E.
    Brown, Nathan C.
    BUILDING AND ENVIRONMENT, 2025, 267
  • [43] CHISEL: Sculpting Tabular and Non-Tabular Data on the Web
    Doleschal, Johannes
    Hoellerich, Nico
    Martens, Wim
    Neven, Frank
    COMPANION PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE 2018 (WWW 2018), 2018, : 139 - 142
  • [44] Mind the Data, Measuring the Performance Gap Between Tree Ensembles and Deep Learning on Tabular Data
    Karlsson, Axel
    Wang, Tianze
    Nowaczyk, Slawomir
    Pashami, Sepideh
    Asadi, Sahar
    ADVANCES IN INTELLIGENT DATA ANALYSIS XXII, PT I, IDA 2024, 2024, 14641 : 65 - 76
  • [45] Automatic Machine Learning-Based OLAP Measure Detection for Tabular Data
    Yang, Yuzhao
    Abdelhedi, Fatma
    Darmont, Jerome
    Ravat, Franck
    Teste, Olivier
    BIG DATA ANALYTICS AND KNOWLEDGE DISCOVERY, DAWAK 2022, 2022, 13428 : 173 - 188
  • [46] SET: Searching Effective Supervised Learning Augmentations in Large Tabular Data Repositories
    Liu, Jiaxiang
    Huang, Zezhou
    Wu, Eugene
    FIRST WORKSHOP ON GOVERNANCE, UNDERSTANDING, AND INTEGRATION OF DATA FOR EFFECTIVE AND RESPONSIBLE AI, GUIDE-AI 2024, 2024, : 26 - 31
  • [47] Federated Learning for Tabular Data using TabNet: A Vehicular Use-Case
    Lindskog, William
    Prehofer, Christian
    2022 IEEE 18TH INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTER COMMUNICATION AND PROCESSING, ICCP, 2022, : 105 - 111
  • [48] A machine-learning approach to automatic detection of delimiters in tabular data files
    Saurav, Shitesh
    Schwarz, Peter
    PROCEEDINGS OF 2016 IEEE 18TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS; IEEE 14TH INTERNATIONAL CONFERENCE ON SMART CITY; IEEE 2ND INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS), 2016, : 1501 - 1503
  • [49] A survey on self-supervised learning for non-sequential tabular data
    Wang, Wei-Yao
    Du, Wei-Wei
    Xu, Derek
    Wang, Wei
    Peng, Wen-Chih
    MACHINE LEARNING, 2025, 114 (01)
  • [50] SubTab: Subsetting Features of Tabular Data for Self-Supervised Representation Learning
    Ucar, Talip
    Hajiramezanali, Ehsan
    Edwards, Lindsay
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,