ComplexCodeEval: A Benchmark for Evaluating Large Code Models on More Complex Code

Cited by: 0
Authors
Feng, Jia [1 ]
Liu, Jiachen [2 ]
Gao, Cuiyun [2 ]
Chong, Chun Yong [3 ]
Wang, Chaozheng [4 ]
Gao, Shan [5 ]
Xia, Xin [5 ]
Affiliations
[1] University of Electronic Science and Technology of China, Shenzhen, China
[2] Harbin Institute of Technology, Shenzhen, China
[3] HUAWEI, Hong Kong, Hong Kong
[4] The Chinese University of Hong Kong, Hong Kong, Hong Kong
[5] Huawei, Shenzhen, China
Keywords
Compendex
DOI
Not available
Abstract
Information leakage
Pages: 1895 - 1906
Related Papers
50 items in total
  • [1] ComplexCodeEval: A Benchmark for Evaluating Large Code Models on More Complex Code
    Feng, Jia
    Liu, Jiachen
    Gao, Cuiyun
    Chong, Chun Yong
    Wang, Chaozheng
    Gao, Shan
    Xia, Xin
    arXiv, 2024
  • [2] JavaBench: A Benchmark of Object-Oriented Code Generation for Evaluating Large Language Models
    Cao, Jialun
    Chen, Zhiyong
    Wu, Jiarong
    Cheung, Shing-Chi
    Xu, Chang
    Proceedings - 2024 39th ACM/IEEE International Conference on Automated Software Engineering, ASE 2024, : 870 - 882
  • [3] BioCoder: a benchmark for bioinformatics code generation with large language models
    Tang, Xiangru
    Qian, Bill
    Gao, Rick
    Chen, Jiakang
    Chen, Xinyun
    Gerstein, Mark B.
    BIOINFORMATICS, 2024, 40 : i266 - i276
  • [4] CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code
    Zhou, Shuyan
    Alon, Uri
    Agarwal, Sumit
    Neubig, Graham
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 13921 - 13937
  • [5] Framework for evaluating code generation ability of large language models
    Yeo, Sangyeop
    Ma, Yu-Seung
    Kim, Sang Cheol
    Jun, Hyungkook
    Kim, Taeho
    ETRI JOURNAL, 2024, 46 (01) : 106 - 117
  • [6] Evaluating Source Code Quality with Large Language Models: a comparative study
    da Silva Simões, Igor Regis
    Venson, Elaine
    arXiv
  • [7] Invited Paper: VerilogEval: Evaluating Large Language Models for Verilog Code Generation
    Liu, Mingjie
    Pinckney, Nathaniel
    Khailany, Brucek
    Ren, Haoxing
    2023 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER AIDED DESIGN, ICCAD, 2023,
  • [8] Evaluating Large Language Models for Automated CPT Code Prediction in Endovascular Neurosurgery
    Roy, Joanna M.
    Self, D. Mitchell
    Isch, Emily
    Musmar, Basel
    Lan, Matthews
    Keppetipola, Kavantissa
    Koduri, Sravanthi
    Pontarelli, Mary-Katharine
    Tjoumakaris, Stavropoula I.
    Gooch, M. Reid
    Rosenwasser, Robert H.
    Jabbour, Pascal M.
    JOURNAL OF MEDICAL SYSTEMS, 2025, 49 (01)
  • [9] Evaluating Large Language Models for G-Code Debugging, Manipulation, and Comprehension
    Jignasu, Anushrut
    Marshall, Kelly
    Ganapathysubramanian, Baskar
    Balu, Aditya
    Hegde, Chinmay
    Krishnamurthy, Adarsh
    2024 IEEE LLM AIDED DESIGN WORKSHOP, LAD 2024, 2024,
  • [10] Evaluating and Optimizing the Effectiveness of Neural Machine Translation in Supporting Code Retrieval Models: A Study on the CAT Benchmark
    Phan, Hung
    Jannesari, Ali
    International Conference on Information and Knowledge Management, Proceedings, 2023, : 2055 - 2064