On the Effectiveness of Large Language Models in Domain-Specific Code Generation

Cited by: 1
Authors
Gu, Xiaodong [1]
Chen, Meng [1]
Lin, Yalan [1]
Hu, Yuhan [1]
Zhang, Hongyu [2]
Wan, Chengcheng [3]
Wei, Zhao [4]
Xu, Yong [4]
Wang, Juhong [4]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[2] Chongqing Univ, Chongqing, Peoples R China
[3] East China Normal Univ, Shanghai, Peoples R China
[4] Tencent Inc, Beijing, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
large language models; code generation; domain-specific program generation;
DOI
10.1145/3697012
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Large language models (LLMs) such as ChatGPT have shown remarkable capabilities in code generation. Despite significant achievements, they rely on enormous training data to acquire a broad spectrum of open domain knowledge. Besides, their evaluation revolves around open-domain benchmarks like HumanEval, which primarily consist of programming contests. Therefore, it is hard to fully characterize the intricacies and challenges associated with particular domains (e.g., Web, game, and math). In this article, we conduct an in-depth study of the LLMs in domain-specific code generation. Our results demonstrate that LLMs exhibit sub-optimal performance in generating domain-specific code, due to their limited proficiency in utilizing domain-specific libraries. We further observe that incorporating API knowledge as prompts can empower LLMs to generate more professional code. Based on these findings, we further investigate how to effectively incorporate API knowledge into the code generation process. We experiment with three strategies for incorporating domain knowledge, namely, external knowledge inquirer, chain-of-thought prompting, and chain-of-thought fine-tuning. We refer to these strategies as a new code generation approach called DomCoder. Experimental results show that all strategies of DomCoder improve the effectiveness of domain-specific code generation under certain settings.
Pages: 22