50 items in total
- [1] Beyond Factuality: A Comprehensive Evaluation of Large Language Models as Knowledge Generators. 2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023: 6325-6341
- [2] Generating Benchmarks for Factuality Evaluation of Language Models. PROCEEDINGS OF THE 18TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024: 49-66
- [4] Benchmarking medical large language models. NATURE REVIEWS BIOENGINEERING, 2023, 1(08): 543
- [5] Benchmarking Large Language Models on CFLUE - A Chinese Financial Language Understanding Evaluation Dataset. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024: 5673-5693
- [6] Benchmarking DNA large language models on quadruplexes. COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL, 2025, 27: 992-1000
- [7] Benchmarking AutoGen with different large language models. 2024 IEEE CONFERENCE ON ARTIFICIAL INTELLIGENCE, CAI 2024, 2024: 263-264
- [9] Benchmarking Large Language Models: Opportunities and Challenges. PERFORMANCE EVALUATION AND BENCHMARKING, TPCTC 2023, 2024, 14247: 77-89