DeepTLF: robust deep neural networks for heterogeneous tabular data

Cited by: 7

Authors:

Borisov, Vadim [1]

Broelemann, Klaus [2]

Kasneci, Enkelejda [1]

Kasneci, Gjergji [1,2]

Affiliations:

[1] Univ Tubingen, Tubingen, Germany

[2] SCHUFA Holding AG, Wiesbaden, Germany

Source:

INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS | 2023 / Vol. 16 / Issue 01

Keywords:

Deep neural networks; Heterogeneous data; Tabular data; Tabular data encoding; Multimodal learning;

DOI:

10.1007/s41060-022-00350-z

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Although deep neural networks (DNNs) constitute the state of the art in many tasks based on visual, audio, or text data, their performance on heterogeneous, tabular data is typically inferior to that of decision tree ensembles. To close this gap while retaining the flexibility of deep learning under input heterogeneity, we propose DeepTLF, a framework for deep tabular learning. The core idea of our method is to transform the heterogeneous input data into homogeneous data, which boosts the performance of DNNs considerably. For the transformation step, we develop a novel knowledge distillation approach, TreeDrivenEncoder, which exploits the structure of decision trees trained on the available heterogeneous data to map the original input vectors onto homogeneous vectors that a DNN can use to improve predictive performance. Within the proposed framework, we also address multimodal learning, since it is challenging to apply decision tree ensemble methods when other data modalities are present. Through extensive and challenging experiments on various real-world datasets, we demonstrate that the DeepTLF pipeline leads to higher predictive performance. On average, our framework shows a 19.6% performance improvement in comparison to DNNs. The DeepTLF code is publicly available.
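
The abstract describes the encoding step only at a high level. As a rough illustration of how tree structure can turn mixed-type rows into homogeneous vectors, the sketch below evaluates every internal split of a small tree ensemble on a row and records one bit per split. This is an assumption about the general idea, not the authors' TreeDrivenEncoder implementation: the nested-dict tree layout, the names collect_splits and encode, the toy ensemble, and the handling of missing values are all hypothetical, and a real pipeline would take the splits from a trained gradient-boosted model.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical representation: a split is (feature_index, threshold).
Split = Tuple[int, float]


def collect_splits(tree: dict) -> List[Split]:
    """Gather (feature, threshold) pairs from every internal node, depth-first."""
    if "leaf" in tree:
        return []
    here = [(tree["feature"], tree["threshold"])]
    return here + collect_splits(tree["left"]) + collect_splits(tree["right"])


def encode(row: Sequence[Optional[float]], splits: List[Split]) -> List[int]:
    """Map one row to a binary vector: bit i is 1 when the row satisfies split i."""
    bits = []
    for feature, threshold in splits:
        value = row[feature]
        # Assumption: a missing value yields 0; this choice is arbitrary.
        bits.append(0 if value is None else int(value <= threshold))
    return bits


# Two hand-written toy trees standing in for a trained ensemble.
# Feature 0: age (numeric), feature 1: integer-coded category, feature 2: income.
toy_ensemble = [
    {
        "feature": 0, "threshold": 30.0,
        "left": {"leaf": 0.1},
        "right": {
            "feature": 1, "threshold": 1.0,
            "left": {"leaf": 0.4},
            "right": {"leaf": 0.9},
        },
    },
    {
        "feature": 2, "threshold": 50000.0,
        "left": {"leaf": -0.2},
        "right": {"leaf": 0.3},
    },
]

all_splits = [s for tree in toy_ensemble for s in collect_splits(tree)]
encoded = encode([42.0, 0.0, 61000.0], all_splits)
print(encoded)  # one bit per internal node across the ensemble
```

Every row, whatever mix of numeric and categorical columns it started with, comes out as a fixed-length binary vector with one entry per internal node, which is the kind of homogeneous input a DNN can consume directly.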


Pages: 85-100

Number of pages: 16
