TLTD: Transfer Learning for Tabular Data

Cited by: 7
Authors
Bragilovski, Maxim [1 ]
Kapri, Zahi [1 ]
Rokach, Lior [1 ]
Levy-Tzedek, Shelly [2 ,3 ,4 ]
Affiliations
[1] Ben Gurion Univ Negev, Dept Software & Informat Syst Engn, IL-8410501 Beer Sheva, Israel
[2] Ben Gurion Univ Negev, Recanati Sch Community Hlth Profess, Dept Phys Therapy, Beer Sheva, Israel
[3] Ben Gurion Univ Negev, Zlotowski Ctr Neurosci, Beer Sheva, Israel
[4] Univ Freiburg, Freiburg Inst Adv Studies FRIAS, Freiburg, Germany
Keywords
Deep-learning; Feature-extraction; Tabular datasets
DOI
10.1016/j.asoc.2023.110748
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep Neural Networks (DNNs) have become effective for various machine learning tasks. DNNs are known to achieve high accuracy with unstructured data, in which each data sample (e.g., an image) consists of many raw features (e.g., pixels) of the same type. The effectiveness of this approach diminishes for structured (tabular) data, where in most cases decision tree-based models such as Random Forest (RF) or Gradient Boosting Decision Trees (GBDT) outperform DNNs. In addition, DNNs tend to perform poorly when the number of samples in the dataset is small. This paper introduces Transfer Learning for Tabular Data (TLTD), which utilizes a novel learning architecture designed to extract new features from structured datasets. To exploit the learning capabilities that DNNs exhibit on images, we convert the tabular data into images and then use a distillation technique to achieve better learning. We evaluated our approach on 25 structured datasets and compared the outcomes to those of RF, eXtreme Gradient Boosting (XGBoost), TabNet, KNN, and TabPFN. The results demonstrate the usefulness of the TLTD approach. © 2023 Elsevier B.V. All rights reserved.
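
The abstract describes the core TLTD step as reshaping each tabular sample into an image so that an image DNN can extract features from it. The record does not spell out the exact mapping, so the following is a minimal, hypothetical Python sketch of the general tabular-to-image idea only; the function name row_to_image and the min-max normalization are illustrative assumptions, not the authors' method.

    import numpy as np

    def row_to_image(row, side=None):
        """Map one tabular sample (a 1-D feature vector) onto a square
        2-D grid, zero-padding the tail, so an image model can consume it.
        Generic sketch only; not the exact TLTD mapping."""
        row = np.asarray(row, dtype=np.float32)
        lo, hi = row.min(), row.max()
        if hi > lo:  # min-max normalize features into [0, 1], like pixel intensities
            row = (row - lo) / (hi - lo)
        if side is None:  # smallest square grid that holds every feature
            side = int(np.ceil(np.sqrt(row.size)))
        img = np.zeros(side * side, dtype=np.float32)
        img[: row.size] = row
        return img.reshape(side, side)

    # Example: a 10-feature sample becomes a 4x4 "image" with 6 zero-padded cells.
    sample = np.random.rand(10)
    print(row_to_image(sample).shape)  # -> (4, 4)

Once samples are in image form, any pretrained image network could, in principle, serve as the feature extractor that the distillation step then builds on.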
Pages: 12