FELIX: Automatic and Interpretable Feature Engineering Using LLMs

Cited by: 0

Authors:

Malberg, Simon [1]

Mosca, Edoardo [1]

Groh, Georg [1]

Affiliations:

[1] Tech Univ Munich, Sch Computat Informat & Technol, Munich, Germany

Source:

MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, PT IV, ECML PKDD 2024 | 2024 / Vol. 14944

Keywords:

Large Language Models; Natural Language Processing; Feature Engineering; Text Classification

DOI:

10.1007/978-3-031-70359-1_14

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Pre-processing and feature engineering are essential yet labor-intensive components of NLP. Engineers must often balance the demand for high model accuracy against interpretability, all while dealing with unstructured data. We address this issue by introducing Feature Engineering with LLMs for Interpretability and Explainability (FELIX), a novel approach that harnesses the vast world knowledge embedded in pre-trained Large Language Models (LLMs) to automatically generate a set of features describing the data. These features are human-interpretable, bring structure to text samples, and can be easily leveraged to train downstream classifiers. We test FELIX across five different text classification tasks, showing that it performs better than feature extraction baselines such as TF-IDF and LLM embeddings, as well as the zero-shot performance of a state-of-the-art LLM and a fine-tuned text classifier. Further experiments also showcase FELIX's strengths in terms of sample efficiency and generalization capabilities, making it a low-effort and reliable method for automatic and interpretable feature extraction. We release our code and supplementary material: https://github.com/simonmalberg/felix.
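
The abstract describes a two-stage idea: texts are first mapped to a small set of named, human-readable features, and a downstream classifier is then trained on those features. The Python sketch below illustrates only that general shape and is not the authors' implementation. Every name in it is hypothetical: the feature list is hard-coded, `stub_feature_scorer` is a keyword-based stand-in for what would be an LLM call, and the majority-vote lookup table stands in for whatever downstream classifier is actually used.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical feature names. In the approach described above, features are
# generated by an LLM; they are hard-coded here purely for illustration.
FEATURES: List[str] = ["mentions_price", "positive_tone"]

Scorer = Callable[[str], Dict[str, int]]
Model = Tuple[Dict[Tuple[int, ...], str], str]


def stub_feature_scorer(text: str) -> Dict[str, int]:
    """Keyword-based stand-in (assumption) for an LLM that scores each feature."""
    lowered = text.lower()
    return {
        "mentions_price": int("price" in lowered or "$" in lowered),
        "positive_tone": int(any(w in lowered for w in ("great", "love", "good"))),
    }


def featurize(texts: List[str], scorer: Scorer) -> List[List[int]]:
    """Turn each text into a row of feature values, ordered as in FEATURES."""
    rows = []
    for text in texts:
        values = scorer(text)
        rows.append([values[name] for name in FEATURES])
    return rows


def fit_majority_table(rows: List[List[int]], labels: List[str]) -> Model:
    """Fit a toy classifier: the majority label for each observed feature row."""
    counts: Dict[Tuple[int, ...], Counter] = {}
    for row, label in zip(rows, labels):
        counts.setdefault(tuple(row), Counter())[label] += 1
    table = {key: c.most_common(1)[0][0] for key, c in counts.items()}
    default = Counter(labels).most_common(1)[0][0]  # used for unseen rows
    return table, default


def predict(model: Model, row: List[int]) -> str:
    """Look up the label for a feature row, falling back to the default."""
    table, default = model
    return table.get(tuple(row), default)


# Toy data, invented for this sketch.
train_texts = [
    "Great phone, love it",
    "The price is too high",
    "Good value for the price",
    "Broke after a week",
]
train_labels = ["pos", "neg", "pos", "neg"]

model = fit_majority_table(featurize(train_texts, stub_feature_scorer), train_labels)
print(predict(model, featurize(["I love this"], stub_feature_scorer)[0]))
```

Because each prediction is a lookup on named feature values, the basis for a label can be read directly from the feature row, which is the kind of interpretability the abstract refers to.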


Pages: 230-246

Page count: 17

50 items in total

[21] FlowMind: Automatic Workflow Generation with LLMs
Zeng, Zhen
Watson, William
Cho, Nicole
Rahimi, Saba
Reynolds, Shayleen
Balch, Tucker
Veloso, Manuela
PROCEEDINGS OF THE 4TH ACM INTERNATIONAL CONFERENCE ON AI IN FINANCE, ICAIF 2023, 2023, : 73 - 81
[22] Automatic Feature Engineering for Learning Compact Decision Trees
Roshanski, Inbal
Kalech, Meir
Rokach, Lior
EXPERT SYSTEMS WITH APPLICATIONS, 2023, 229
[23] Self-Organizing Transformations for Automatic Feature Engineering
Rodrigues, Ericks da Silva
Lima Martins, Denis Mayr
de Lima Neto, Fernando Buarque
2021 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2021), 2021,
[24] Integrating LLMs in the Engineering of a SAR Ontology
Doumanas, Dimitrios
Soularidis, Andreas
Kotis, Konstantinos
Vouros, George
ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS, PT IV, AIAI 2024, 2024, 714 : 360 - 374
[25] Engineering section analysis based on automatic feature points matching in reverse engineering
Shu, Lingjie
PROCEEDINGS OF THE 2016 7TH INTERNATIONAL CONFERENCE ON MECHATRONICS, CONTROL AND MATERIALS (ICMCM 2016), 2016, 104 : 691 - 694
[26] Interpretable Emotion Classification Using Multidomain Feature of EEG Signals
Zhao, Kunyuan
Xu, Dan
He, Kangjian
Peng, Guoqin
IEEE SENSORS JOURNAL, 2023, 23 (11) : 11879 - 11891
[27] Automatic velocity analysis using interpretable multimode neural networks
Zhang, Haifeng
Yuan, Sanyi
Zeng, Huahui
Yuan, Huan
Gao, Yang
Wang, Shangxu
GEOPHYSICAL JOURNAL INTERNATIONAL, 2023, 235 (01) : 216 - 230
[28] Towards an Intelligent Test Case Generation Framework Using LLMs and Prompt Engineering
Boukhlif, Mohamed
Kharmoum, Nassim
Hanine, Mohamed
Kodad, Mohcine
Lagmiri, Souad Najoua
ADVANCES IN SMART MEDICAL, IOT & ARTIFICIAL INTELLIGENCE, VOL 2, ICSMAI 2024, 2024, 12 : 24 - 31
[29] Automatic Feature Engineering Through Monte Carlo Tree Search
Huang, Yiran
Zhou, Yexu
Hefenbrock, Michael
Riedel, Till
Fang, Likun
Beigl, Michael
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2022, PT III, 2023, 13715 : 581 - 598
[30] Aircraft bearing fault diagnosis based on automatic feature engineering
Zhang C.
Li H.
Hu H.
Zhu C.
Zhang Y.
Nan G.
Shu Y.
Huagong Xuebao/CIESC Journal, 2021, 72 : 430 - 436
