Prompt Engineering or Fine-Tuning? A Case Study on Phishing Detection with Large Language Models

Cited by: 6

Authors:

Trad, Fouad [1]

Chehab, Ali [1]

Affiliations:

[1] Amer Univ Beirut, Elect & Comp Engn, Beirut 11072020, Lebanon

Source:

MACHINE LEARNING AND KNOWLEDGE EXTRACTION | 2024 / Vol. 6 / Issue 01

Keywords:

large language models; prompt engineering; fine-tuning; phishing detection; URL

DOI:

10.3390/make6010018

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Large Language Models (LLMs) are reshaping the landscape of Machine Learning (ML) application development. The emergence of versatile LLMs capable of undertaking a wide array of tasks has reduced the necessity for intensive human involvement in training and maintaining ML models. Despite these advancements, a pivotal question emerges: can these generalized models negate the need for task-specific models? This study addresses this question by comparing the effectiveness of LLMs in detecting phishing URLs when utilized with prompt-engineering techniques versus when fine-tuned. Notably, we explore multiple prompt-engineering strategies for phishing URL detection and apply them to two chat models, GPT-3.5-turbo and Claude 2. In this context, the maximum result achieved was an F1-score of 92.74% on a test set of 1000 samples. Following this, we fine-tune a range of base LLMs, including GPT-2, Bloom, Baby LLaMA, and DistilGPT-2 (all primarily developed for text generation), exclusively for phishing URL detection. The fine-tuning approach culminated in a peak performance, achieving an F1-score of 97.29% and an AUC of 99.56% on the same test set, thereby outperforming existing state-of-the-art methods. These results highlight that while LLMs harnessed through prompt engineering can expedite application development processes, achieving a decent performance, they are not as effective as dedicated, task-specific LLMs.

Pages: 367-384

Page count: 18

50 records in total

[1] Fine-tuning and prompt engineering for large language models-based code review automation
Pornprasit, Chanathip
Tantithamthavorn, Chakkrit
[J]. INFORMATION AND SOFTWARE TECHNOLOGY, 2024, 175
[2] Prompting or Fine-tuning? A Comparative Study of Large Language Models for Taxonomy Construction
Chen, Boqi
Yi, Fandi
Varro, Daniel
[J]. 2023 ACM/IEEE INTERNATIONAL CONFERENCE ON MODEL DRIVEN ENGINEERING LANGUAGES AND SYSTEMS COMPANION, MODELS-C, 2023, : 588 - 596
[3] Getting it right: the limits of fine-tuning large language models
Browning, Jacob
[J]. ETHICS AND INFORMATION TECHNOLOGY, 2024, 26 (02)
[4] Scaling Federated Learning for Fine-Tuning of Large Language Models
Hilmkil, Agrin
Callh, Sebastian
Barbieri, Matteo
Sutfeld, Leon Rene
Zec, Edvin Listo
Mogren, Olof
[J]. NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS (NLDB 2021), 2021, 12801 : 15 - 23
[5] Fine-tuning large language models for chemical text mining
Zhang, Wei
Wang, Qinggong
Kong, Xiangtai
Xiong, Jiacheng
Ni, Shengkun
Cao, Duanhua
Niu, Buying
Chen, Mingan
Li, Yameng
Zhang, Runze
Wang, Yitian
Zhang, Lehan
Li, Xutong
Xiong, Zhaoping
Shi, Qian
Huang, Ziming
Fu, Zunyun
Zheng, Mingyue
[J]. CHEMICAL SCIENCE, 2024, 15 (27) : 10600 - 10611
[6] Fine-tuning large neural language models for biomedical natural language processing
Tinn, Robert
Cheng, Hao
Gu, Yu
Usuyama, Naoto
Liu, Xiaodong
Naumann, Tristan
Gao, Jianfeng
Poon, Hoifung
[J]. PATTERNS, 2023, 4 (04)
[7] Using Diffusion Models for Dataset Generation: Prompt Engineering vs. Fine-Tuning
Voetman, Roy
van Meekeren, Alexander
Aghaei, Maya
Dijkstra, Klaas
[J]. COMPUTER ANALYSIS OF IMAGES AND PATTERNS, CAIP 2023, PT I, 2023, 14184 : 143 - 153
[8] A survey of efficient fine-tuning methods for Vision-Language Models - Prompt and Adapter
Xing, Jialu
Liu, Jianping
Wang, Jian
Sun, Lulu
Chen, Xi
Gu, Xunxun
Wang, Yingfei
[J]. COMPUTERS & GRAPHICS-UK, 2024, 119
[9] An Empirical Study on Fine-tuning Large Language Models of Code for Automated Program Repair
Huang, Kai
Meng, Xiangxin
Zhang, Jian
Liu, Yang
Wang, Wenjie
Li, Shuhao
Zhang, Yuqing
[J]. 2023 38TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING, ASE, 2023, : 1162 - 1174
[10] Fine-Tuning Large Language Models for Private Document Retrieval: A Tutorial
Sommers, Frank
Kongthon, Alisa
Kongyoung, Sarawoot
[J]. PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024, 2024, : 1319 - 1320