Prompt Engineering or Fine-Tuning? A Case Study on Phishing Detection with Large Language Models

Cited by: 6
Authors
Trad, Fouad [1 ]
Chehab, Ali [1 ]
Affiliations
[1] Amer Univ Beirut, Elect & Comp Engn, Beirut 1107 2020, Lebanon
Source
MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 2024, 6 (1): 367-384
Keywords
large language models; prompt engineering; fine-tuning; phishing detection; URL;
DOI
10.3390/make6010018
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Large Language Models (LLMs) are reshaping the landscape of Machine Learning (ML) application development. The emergence of versatile LLMs capable of undertaking a wide array of tasks has reduced the need for intensive human involvement in training and maintaining ML models. Despite these advancements, a pivotal question remains: can these generalized models negate the need for task-specific models? This study addresses the question by comparing the effectiveness of LLMs at detecting phishing URLs when used with prompt-engineering techniques versus when fine-tuned. Notably, we explore multiple prompt-engineering strategies for phishing URL detection and apply them to two chat models, GPT-3.5-turbo and Claude 2. In this setting, the best result was an F1-score of 92.74% on a test set of 1000 samples. We then fine-tune a range of base LLMs, including GPT-2, Bloom, Baby LLaMA, and DistilGPT-2 (all primarily developed for text generation) exclusively for phishing URL detection. The fine-tuning approach culminated in a peak performance of a 97.29% F1-score and a 99.56% AUC on the same test set, outperforming existing state-of-the-art methods. These results highlight that while LLMs harnessed through prompt engineering can expedite application development and achieve decent performance, they are not as effective as dedicated, task-specific LLMs.
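The abstract describes the two approaches only at a high level, so two hedged sketches follow. First, a minimal zero-shot prompting sketch in Python (openai>=1.0); the system prompt, the classify_url helper, and the example URL are illustrative assumptions, not the prompt-engineering strategies evaluated in the paper.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def classify_url(url: str) -> str:
    """Ask a chat model to label a URL as phishing or legitimate."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        temperature=0,  # deterministic output simplifies label parsing
        messages=[
            {"role": "system",
             "content": "You are a security classifier. Answer with exactly "
                        "one word: phishing or legitimate."},
            {"role": "user", "content": f"Classify this URL: {url}"},
        ],
    )
    return response.choices[0].message.content.strip().lower()

print(classify_url("http://paypa1-secure-login.example.com/verify"))

Second, a minimal fine-tuning sketch with Hugging Face Transformers, assuming a binary classification head on DistilGPT-2; the toy dataset, label encoding, and hyperparameters are placeholders rather than the paper's actual recipe.

from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 models define no pad token

model = AutoModelForSequenceClassification.from_pretrained(
    "distilgpt2", num_labels=2)  # 0 = legitimate, 1 = phishing (assumed)
model.config.pad_token_id = tokenizer.pad_token_id

# Toy stand-in for a real labeled URL dataset.
data = Dataset.from_dict({
    "text": ["https://www.wikipedia.org", "http://paypa1-login.example.com"],
    "label": [0, 1],
})
data = data.map(lambda x: tokenizer(x["text"], truncation=True,
                                    padding="max_length", max_length=64))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="url-clf", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=data,
)
trainer.train()

Attaching a classification head is one common way to repurpose a text-generation model for detection; the paper's fine-tuning setup may differ.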
Pages: 367-384
Page count: 18
相关论文
共 50 条
  • [21] Role of Language Relatedness in Multilingual Fine-tuning of Language Models: A Case Study in Indo-Aryan Languages
    Dhamecha, Tejas Indulal
    Murthy, Rudra, V
    Bharadwaj, Samarth
    Sankaranarayanan, Karthik
    Bhattacharyya, Pushpak
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 8584 - 8595
  • [22] Improve Performance of Fine-tuning Language Models with Prompting
    Yang, Zijian Gyozo
    Ligeti-Nagy, Noenn
    [J]. INFOCOMMUNICATIONS JOURNAL, 2023, 15 : 62 - 68
  • [23] CONVFIT: Conversational Fine-Tuning of Pretrained Language Models
    Vulic, Ivan
    Su, Pei-Hao
    Coope, Sam
    Gerz, Daniela
    Budzianowski, Pawel
    Casanueva, Inigo
    Mrksic, Nikola
    Wen, Tsung-Hsien
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 1151 - 1168
  • [24] Fine-tuning language models to recognize semantic relations
    Roussinov, Dmitri
    Sharoff, Serge
    Puchnina, Nadezhda
    [J]. LANGUAGE RESOURCES AND EVALUATION, 2023, 57 (04) : 1463 - 1486
  • [25] Fine-Tuning Language Models with Just Forward Passes
    Malladi, Sadhika
    Gao, Tianyu
    Nichani, Eshaan
    Damian, Alex
    Lee, Jason D.
    Chen, Danqi
    Arora, Sanjeev
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [26] Fine-tuning language models to recognize semantic relations
    Dmitri Roussinov
    Serge Sharoff
    Nadezhda Puchnina
    [J]. Language Resources and Evaluation, 2023, 57 : 1463 - 1486
  • [27] How fine can fine-tuning be? Learning efficient language models
    Radiya-Dixit, Evani
    Wang, Xin
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 2435 - 2442
  • [28] Efficient Fine-Tuning Large Language Models for Knowledge-Aware Response Planning
    Minh Nguyen
    Kishan, K. C.
    Toan Nguyen
    Chadha, Ankit
    Thuy Vu
    [J]. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, ECML PKDD 2023, PT II, 2023, 14170 : 593 - 611
  • [29] Leveraging error-assisted fine-tuning large language models for manufacturing excellence
    Xia, Liqiao
    Li, Chengxi
    Zhang, Canbin
    Liu, Shimin
    Zheng, Pai
    [J]. ROBOTICS AND COMPUTER-INTEGRATED MANUFACTURING, 2024, 88
  • [30] Data Selection for Fine-tuning Large Language Models Using Transferred Shapley Values
    Schoch, Stephanie
    Mishra, Ritwick
    Ji, Yangfeng
    [J]. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-SRW 2023, VOL 4, 2023, : 266 - 275