FELIX: Automatic and Interpretable Feature Engineering Using LLMs

Authors
Malberg, Simon [1]
Mosca, Edoardo [1]
Groh, Georg [1]
Affiliations
[1] Tech Univ Munich, Sch Computat Informat & Technol, Munich, Germany
Keywords
Large Language Models; Natural Language Processing; Feature Engineering; Text Classification
DOI
10.1007/978-3-031-70359-1_14
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Pre-processing and feature engineering are essential yet labor-intensive components of NLP. Engineers must often balance the demand for high model accuracy against interpretability, all while having to deal with unstructured data. We address this issue by introducing Feature Engineering with LLMs for Interpretability and Explainability (FELIX), a novel approach harnessing the vast world knowledge embedded in pre-trained Large Language Models (LLMs) to automatically generate a set of features describing the data. These features are human-interpretable, bring structure to text samples, and can be easily leveraged to train downstream classifiers. We test FELIX across five different text classification tasks, showing that it outperforms feature extraction baselines such as TF-IDF and LLM embeddings, as well as the zero-shot performance of state-of-the-art LLMs and a fine-tuned text classifier. Further experiments also showcase FELIX's strengths in terms of sample efficiency and generalization capabilities, making it a low-effort and reliable method for automatic and interpretable feature extraction. We release our code and supplementary material: https://github.com/simonmalberg/felix.
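The general idea in the abstract (turn each text into a small set of named, human-readable features, then train a downstream classifier on them) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature names, the keyword-based `stub_llm_featurize` function standing in for an LLM call, the toy data, and the choice of a perceptron as the downstream classifier are all assumptions made for illustration only.

```python
# Hypothetical feature schema. In an LLM-based pipeline these features would be
# proposed by the model; here they are hard-coded for illustration.
FEATURES = {
    "mentions_price": ("price", "cost", "expensive", "cheap"),
    "expresses_praise": ("great", "love", "excellent"),
    "expresses_complaint": ("broken", "terrible", "refund"),
}


def stub_llm_featurize(text):
    """Stand-in for an LLM call: set each feature to 1 if any keyword occurs."""
    tokens = set(text.lower().split())
    return {
        name: int(any(keyword in tokens for keyword in keywords))
        for name, keywords in FEATURES.items()
    }


def score(weights, bias, row):
    """Linear score over the named features."""
    return bias + sum(weights[name] * row[name] for name in weights)


def predict(weights, bias, row):
    """Binary prediction: 1 if the linear score is positive, else 0."""
    return 1 if score(weights, bias, row) > 0 else 0


def train_perceptron(rows, labels, epochs=20):
    """Train a simple perceptron; one weight per interpretable feature."""
    weights = {name: 0.0 for name in FEATURES}
    bias = 0.0
    for _ in range(epochs):
        for row, label in zip(rows, labels):
            error = label - predict(weights, bias, row)
            if error:
                bias += error
                for name in weights:
                    weights[name] += error * row[name]
    return weights, bias


# Toy data (assumed): 1 = positive review, 0 = negative review.
texts = [
    "great product love it",
    "terrible quality want refund",
    "excellent and cheap",
    "broken on arrival",
]
labels = [1, 0, 1, 0]

rows = [stub_llm_featurize(text) for text in texts]
weights, bias = train_perceptron(rows, labels)

# Each learned weight is attached to a readable feature name.
for name, weight in weights.items():
    print(f"{name}: {weight:+.1f}")
```

Because every weight belongs to a named feature, the trained model can be inspected directly, which is the interpretability benefit the abstract refers to; a real pipeline would replace the keyword stub with actual LLM prompts and use a stronger classifier.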
Pages: 230-246
Page count: 17