共 50 条
- [21] SEED-Bench: Benchmarking Multimodal Large Language Models 2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 13299 - 13308
- [22] Quantifying Bias in Agentic Large Language Models: A Benchmarking Approach 2024 5TH INFORMATION COMMUNICATION TECHNOLOGIES CONFERENCE, ICTC 2024, 2024, : 349 - 353
- [24] Using Large Language Models for Robot-Assisted Therapeutic Role-Play: Factuality is not enough! PROCEEDINGS OF THE 6TH CONFERENCE ON ACM CONVERSATIONAL USER INTERFACES, CUI 2024, 2024,
- [25] Towards Benchmarking and Improving the Temporal Reasoning Capability of Large Language Models PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 14820 - 14835
- [28] Benchmarking Large Language Models for Automated Verilog RTL Code Generation 2023 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION, DATE, 2023,
- [29] Benchmarking Large Language Models on Controllable Generation under Diversified Instructions THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 16, 2024, : 17808 - 17816
- [30] Benchmarking Causal Study to Interpret Large Language Models for Source Code 2023 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE AND EVOLUTION, ICSME, 2023, : 329 - 334