FELIX: Automatic and Interpretable Feature Engineering Using LLMs

Authors
Malberg, Simon [1]
Mosca, Edoardo [1]
Groh, Georg [1]
Affiliations
[1] Tech Univ Munich, Sch Computat Informat & Technol, Munich, Germany
Keywords
Large Language Models; Natural Language Processing; Feature Engineering; Text Classification
DOI
10.1007/978-3-031-70359-1_14
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Pre-processing and feature engineering are essential yet labor-intensive components of NLP. Engineers must often balance the demand for high model accuracy against interpretability, all while dealing with unstructured data. We address this issue by introducing Feature Engineering with LLMs for Interpretability and Explainability (FELIX), a novel approach that harnesses the vast world knowledge embedded in pre-trained Large Language Models (LLMs) to automatically generate a set of features describing the data. These features are human-interpretable, bring structure to text samples, and can be easily leveraged to train downstream classifiers. We test FELIX across five different text classification tasks, showing that it performs better than feature extraction baselines such as TF-IDF and LLM embeddings, as well as the zero-shot performance of state-of-the-art LLMs and a fine-tuned text classifier. Further experiments also showcase FELIX's strengths in terms of sample efficiency and generalization capabilities, making it a low-effort and reliable method for automatic and interpretable feature extraction. We release our code and supplementary material: https://github.com/simonmalberg/felix.
Pages: 230-246
Number of pages: 17