Effective Context Selection in LLM-Based Leaderboard Generation: An Empirical Study

Cited by: 0

Authors:

Kabongo, Salomon [1]

D'Souza, Jennifer [2]

Auer, Soren [2]

Affiliations:

[1] Leibniz Univ Hannover, L3S Res Ctr, Hannover, Germany

[2] TIB Leibniz Informat Ctr Sci & Technol, Hannover, Germany

Source:

NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, PT II, NLDB 2024 | 2024 / Vol. 14763

Keywords:

DOI:

10.1007/978-3-031-70242-6_15

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper explores the impact of context selection on the efficiency of Large Language Models (LLMs) in generating Artificial Intelligence (AI) research leaderboards, a task defined as the extraction of (Task, Dataset, Metric, Score) quadruples from scholarly articles. By framing this challenge as a text generation objective and employing instruction finetuning with the FLAN-T5 collection, we introduce a novel method that surpasses traditional Natural Language Inference (NLI) approaches in adapting to new developments without a predefined taxonomy. Through experimentation with three distinct context types of varying selectivity and length, our study demonstrates the importance of effective context selection in enhancing LLM accuracy and reducing hallucinations, providing a new pathway for the reliable and efficient generation of AI leaderboards. This contribution not only advances the state of the art in leaderboard generation but also sheds light on strategies to mitigate common challenges in LLM-based information extraction.

Pages: 150 - 160

Page count: 11
