Effective Context Selection in LLM-Based Leaderboard Generation: An Empirical Study

Authors
Kabongo, Salomon [1]
D'Souza, Jennifer [2]
Auer, Soren [2]
Affiliations
[1] Leibniz Univ Hannover, L3S Res Ctr, Hannover, Germany
[2] TIB Leibniz Informat Ctr Sci & Technol, Hannover, Germany
DOI
10.1007/978-3-031-70242-6_15
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper explores the impact of context selection on the efficiency of Large Language Models (LLMs) in generating Artificial Intelligence (AI) research leaderboards, a task defined as the extraction of (Task, Dataset, Metric, Score) quadruples from scholarly articles. By framing this challenge as a text generation objective and employing instruction finetuning with the FLAN-T5 collection, we introduce a novel method that surpasses traditional Natural Language Inference (NLI) approaches in adapting to new developments without a predefined taxonomy. Through experimentation with three distinct context types of varying selectivity and length, our study demonstrates the importance of effective context selection in enhancing LLM accuracy and reducing hallucinations, providing a new pathway for the reliable and efficient generation of AI leaderboards. This contribution not only advances the state of the art in leaderboard generation but also sheds light on strategies to mitigate common challenges in LLM-based information extraction.
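The extraction target described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the prompt wording, the JSON output format, and the names `LeaderboardEntry`, `build_prompt`, and `parse_entries` are all assumptions made for illustration, and the model call itself is left out.

```python
import json
from dataclasses import dataclass
from typing import List

# Hypothetical container for one (Task, Dataset, Metric, Score) quadruple.
@dataclass(frozen=True)
class LeaderboardEntry:
    task: str
    dataset: str
    metric: str
    score: str

KEYS = ("task", "dataset", "metric", "score")

def build_prompt(context: str) -> str:
    """Wrap a selected context passage in an instruction (wording is assumed)."""
    return (
        "Extract every (Task, Dataset, Metric, Score) quadruple reported in "
        "the text below. Answer with a JSON list of objects, or [] if the "
        "text reports none.\n\n"
        f"Text: {context}"
    )

def parse_entries(generated: str) -> List[LeaderboardEntry]:
    """Parse generated text into entries; malformed output yields an empty list."""
    try:
        items = json.loads(generated)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    entries = []
    for item in items:
        # Keep only objects that carry all four fields.
        if isinstance(item, dict) and all(k in item for k in KEYS):
            entries.append(LeaderboardEntry(*(str(item[k]) for k in KEYS)))
    return entries
```

In this sketch the choice of context amounts to choosing which passage is passed to `build_prompt`. A shorter, more selective passage gives the model less unrelated text to draw on, which is the effect the abstract attributes to context selection.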
Pages: 150-160
Number of pages: 11