Exploring Large Language Models in a Limited Resource Scenario

Cited by: 0

Authors:

Panchbhai, Anand [1]

Pankanti, Smarana [1]

Affiliations:

[1] Indian Inst Technol Bhilai, Dept Elect Engn & Comp Sci, Logy AI, Raipur, Madhya Pradesh, India

Source:

2021 11TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, DATA SCIENCE & ENGINEERING (CONFLUENCE 2021) | 2021

Keywords:

GPT-2; Sentiment-Analysis; Language-Models; Explainability; Limited-Resources; SENTIMENT ANALYSIS;

DOI:

10.1109/Confluence51648.2021.9377081

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Generative Pre-trained Transformers (GPT) have gained considerable popularity in Natural Language Processing (NLP). Recently, GPT models have been fine-tuned for tasks such as sentiment analysis and text summarization. As the number of tunable parameters grows with larger language models (such as GPT-3), fine-tuning them on commercially available personal computers becomes resource-heavy. In addition, GPT-3 is available only through an API, which makes it even harder to fine-tune for a specific task. This makes these models less accessible to the general public and to researchers. Alternative ways are required to better understand the nature of these language models and to employ them for challenging NLP tasks without explicit fine-tuning. This study capitalizes on the raw capabilities of GPT-2: it proposes one such system and demonstrates its efficacy on sentiment analysis without explicit fine-tuning. It also sheds light on the nature of such generative language models and shows how explainability can be exploited to achieve good results with minimal resources. The proposed system was observed to capture the sentiment of a given text well, reaching an accuracy of 82% on a subset of the IMDB dataset of movie reviews. The system performed better with natural-language prompts than with symbol-based syntactic prompts.
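
The abstract does not give the prompts or the scoring rule used in the paper, so the following is only a minimal sketch of the general idea of prompt-based sentiment classification without fine-tuning: wrap the review in a prompt, then pick the label whose word the language model considers the more likely continuation. The two templates, the label words, and the names `classify` and `toy_score` are all assumptions made for illustration. `toy_score` is a keyword-counting stand-in, not a language model; in practice it would be replaced by a function that returns the log-probability GPT-2 assigns to the continuation given the prompt.

```python
from typing import Callable, Dict

# Hypothetical templates: one natural-language prompt and one symbol-based
# prompt. The prompts actually used in the paper are not stated in the abstract.
NATURAL_PROMPT = "Review: {text}\nOverall, the sentiment of this review is"
SYMBOL_PROMPT = "{text} => sentiment:"

# Assumed label words; the leading space mimics how a continuation is tokenized.
LABEL_WORDS: Dict[str, str] = {"positive": " positive", "negative": " negative"}

# A scorer maps (prompt, continuation) to a score; higher means more likely.
ScoreFn = Callable[[str, str], float]


def classify(text: str, score_fn: ScoreFn, template: str = NATURAL_PROMPT) -> str:
    """Return the label whose word scores highest as a continuation of the prompt."""
    prompt = template.format(text=text)
    scores = {label: score_fn(prompt, word) for label, word in LABEL_WORDS.items()}
    return max(scores, key=scores.get)


def toy_score(prompt: str, continuation: str) -> float:
    """Stand-in scorer that counts cue words; NOT a language model.

    A real scorer would return the log-probability that a pre-trained model
    such as GPT-2 assigns to `continuation` after `prompt`.
    """
    cues = {
        " positive": ("great", "loved", "wonderful"),
        " negative": ("boring", "awful", "terrible"),
    }
    lowered = prompt.lower()
    return float(sum(lowered.count(cue) for cue in cues[continuation]))
```

With a real model plugged in as `score_fn`, running `classify` over a labeled sample once with `NATURAL_PROMPT` and once with `SYMBOL_PROMPT` would allow the kind of prompt comparison the abstract reports, with no gradient updates to the model.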

Pages: 147-152

Page count: 6

50 items in total

[1] Exploring Large Language Models for Low-Resource IT Information Extraction
Bhavya, Bhavya
Isaza, Paulina Toro
Deng, Yu
Nidd, Michael
Azad, Amar Prakash
Shwartz, Larisa
Zhai, ChengXiang
[J]. 2023 23RD IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS, ICDMW 2023, 2023, : 1203 - 1212
[2] Exploring Large Language Models for Classical Philology
Riemenschneider, Frederick
Frank, Anette
[J]. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 15181 - 15199
[3] Leveraging Large Language Models for VNF Resource Forecasting
Su, Jing
Nair, Suku
Popokh, Leo
[J]. 2024 IEEE 10TH INTERNATIONAL CONFERENCE ON NETWORK SOFTWARIZATION, NETSOFT 2024, 2024, : 258 - 262
[4] Exploring Variability in Risk Taking With Large Language Models
Bhatia, Sudeep
[J]. JOURNAL OF EXPERIMENTAL PSYCHOLOGY-GENERAL, 2024, 153 (07) : 1838 - 1860
[5] Exploring large language models for microstructure evolution in materials
Satpute, Prathamesh
Tiwari, Saurabh
Gupta, Maneet
Ghosh, Supriyo
[J]. MATERIALS TODAY COMMUNICATIONS, 2024, 40
[6] Exploring Capabilities of Large Language Models such as ChatGPT in Radiation
Dennstadt, Fabio
Hastings, Janna
Putora, Paul Martin
Vu, Erwin
Fischer, Galina F.
Suveg, Krisztian
Glatzer, Markus
Riggenbach, Elena
Ha, Hong-Linh
Cihoric, Nikola
[J]. ADVANCES IN RADIATION ONCOLOGY, 2024, 9 (03)
[7] Exploring Large Language Models in Intent Acquisition and Translation
Fontana, Mattia
Martini, Barbara
Sciarrone, Filippo
[J]. 2024 IEEE 10TH INTERNATIONAL CONFERENCE ON NETWORK SOFTWARIZATION, NETSOFT 2024, 2024, : 231 - 234
[8] Exploring Large Language Models for Verilog hardware design generation
D'Hollander, Erik H.
Danneels, Ewout
Decorte, Karel-Brecht
Loobuyck, Senne
Vanheule, Ame
Van Kets, Ian
Stroobandt, Dirk
[J]. 2024 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS, IPDPSW 2024, 2024, : 111 - 115
[9] Exploring the role of large language models in radiation emergency response
Chandra, Anirudh
Chakraborty, Abinash
[J]. JOURNAL OF RADIOLOGICAL PROTECTION, 2024, 44 (01)
[10] Exploring Large Language Models for Trajectory Prediction: A Technical Perspective
Munir, Farzeen
Mihaylova, Tsvetomila
Azam, Shoaib
Kucner, Tomasz Piotr
Kyrki, Ville
[J]. COMPANION OF THE 2024 ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION, HRI 2024 COMPANION, 2024, : 774 - 778
