Prompt Engineering or Fine-Tuning? A Case Study on Phishing Detection with Large Language Models

Cited by: 6
Authors
Trad, Fouad [1]
Chehab, Ali [1]
Affiliation
[1] Amer Univ Beirut, Elect & Comp Engn, Beirut 11072020, Lebanon
Source
Keywords
large language models; prompt engineering; fine-tuning; phishing detection; URL
DOI
10.3390/make6010018
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Large Language Models (LLMs) are reshaping the landscape of Machine Learning (ML) application development. The emergence of versatile LLMs capable of undertaking a wide array of tasks has reduced the need for intensive human involvement in training and maintaining ML models. Despite these advancements, a pivotal question emerges: can these generalized models negate the need for task-specific models? This study addresses this question by comparing the effectiveness of LLMs in detecting phishing URLs when used with prompt-engineering techniques versus when fine-tuned. We explore multiple prompt-engineering strategies for phishing URL detection and apply them to two chat models, GPT-3.5-turbo and Claude 2. With this approach, the best result was an F1-score of 92.74% on a test set of 1000 samples. We then fine-tune a range of base LLMs, including GPT-2, Bloom, Baby LLaMA, and DistilGPT-2, all primarily developed for text generation, exclusively for phishing URL detection. Fine-tuning reached a peak F1-score of 97.29% and an AUC of 99.56% on the same test set, outperforming existing state-of-the-art methods. These results show that while LLMs harnessed through prompt engineering can speed up application development and achieve decent performance, they are not as effective as dedicated, task-specific LLMs.
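The prompt-engineering evaluation summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the prompt wording, the function names `build_zero_shot_prompt` and `parse_label`, and the toy labels are all assumptions, and no model is actually called. It shows how a one-word model reply might be mapped to a binary label and how the F1-score reported in the abstract is computed from predictions.

```python
from typing import List


def build_zero_shot_prompt(url: str) -> str:
    """Hypothetical zero-shot prompt; NOT the wording used in the paper."""
    return (
        "Classify the following URL as 'phishing' or 'legitimate'. "
        "Answer with one word.\n"
        f"URL: {url}"
    )


def parse_label(reply: str) -> int:
    """Naive parser: 1 if the reply mentions 'phishing', else 0.

    Assumes the model follows the one-word instruction; a reply such as
    'not phishing' would be misread, so real code needs stricter parsing.
    """
    return 1 if "phishing" in reply.strip().lower() else 0


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """F1 = 2 * precision * recall / (precision + recall), positive class = 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data for illustration only; these are not results from the paper.
replies = ["Phishing", "legitimate", "legitimate", "phishing"]
y_true = [1, 1, 0, 0]
y_pred = [parse_label(r) for r in replies]
score = f1_score(y_true, y_pred)
```

In a real run, each reply would come from sending `build_zero_shot_prompt(url)` to a chat model, and the score would be computed over the full test set rather than four toy examples.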
Pages: 367-384
Number of pages: 18