Relationalizing Tables with Large Language Models: The Promise and Challenges

Authors
Huang, Zezhou [1]
Wu, Eugene [2]
Affiliations
[1] Columbia Univ, New York, NY 10027 USA
[2] Columbia Univ, DSI, New York, NY 10027 USA
Funding
National Science Foundation (USA);
Keywords
Large Language Model; Data Transformation; Prompt Engineering; Data Management;
DOI
10.1109/ICDEW61823.2024.00045
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Tables in the wild are usually not relationalized, which makes them difficult to query. To relationalize tables, recent work designed seven transformation operators and adopted deep neural networks to automatically find the sequence of operators, achieving an accuracy of 57.0%. In comparison, earlier large language models such as GPT-3.5 reached only 13.1%. However, these results were obtained using naive prompts. Furthermore, GPT-4 has recently become available; it is substantially larger and more performant. This study examines how the choice of model, specifically GPT-3.5 versus GPT-4, and various prompting strategies, such as Chain-of-Thought and Task Decomposition, affect accuracy. The main finding is that GPT-4, combined with Task Decomposition and Chain-of-Thought, attains an accuracy of 74.6%. Further analysis of the errors made by GPT-4 shows that about half of them are due not to the model's shortcomings but to ambiguities in the benchmarks. When these benchmarks are disambiguated, GPT-4's accuracy improves to 86.9%.
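The "about half of the errors" statement can be cross-checked against the two accuracy figures quoted in the abstract. The short sketch below does that arithmetic; it assumes (this is not stated in the abstract) that both accuracies are measured over the same set of benchmark tables, so that the gain from disambiguation can be read as the share of errors caused by ambiguity. The variable names are illustrative only.

```python
# Accuracy figures quoted in the abstract (percent of benchmark tables solved).
acc_gpt4 = 74.6           # GPT-4 + Task Decomposition + Chain-of-Thought
acc_disambiguated = 86.9  # same setup after the benchmarks are disambiguated

# Assumption: both figures share the same denominator (same benchmark tables).
error_rate = 100.0 - acc_gpt4              # percentage points GPT-4 got wrong
recovered = acc_disambiguated - acc_gpt4   # points regained by disambiguation
share_ambiguous = recovered / error_rate   # fraction of errors tied to ambiguity

print(f"errors: {error_rate:.1f} pts, recovered: {recovered:.1f} pts, "
      f"share: {share_ambiguous:.1%}")
```

Under that assumption, 12.3 of the 25.4 error points are recovered, roughly 48%, which is consistent with the abstract's "about half".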
Pages: 305-309
Page count: 5