Pre-trained language models: What do they know?

Cited by: 0
Authors
Guimaraes, Nuno [1 ,2 ]
Campos, Ricardo [1 ,3 ,4 ]
Jorge, Alipio [1 ,2 ]
Affiliations
[1] LIAAD INESCTEC, Porto, Portugal
[2] Univ Porto, Porto, Portugal
[3] Univ Beira Interior, Covilha, Portugal
[4] Polytech Inst Tomar, Ci2 Smart Cities Res Ctr, Tomar, Portugal
Keywords
large language models; natural language processing; pretrained language models
DOI
10.1002/widm.1518
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Large language models (LLMs) have substantially pushed artificial intelligence (AI) research and applications in the last few years. They currently achieve high effectiveness in different natural language processing (NLP) tasks, such as machine translation, named entity recognition, text classification, question answering, and text summarization. Recently, significant attention has been drawn to the capabilities of OpenAI's GPT models and their extremely accessible interface. LLMs are nowadays routinely used and studied for downstream tasks and specific applications with great success, pushing forward the state of the art in almost all of them. However, they also exhibit impressive inference capabilities when used off the shelf, without further training. In this paper, we aim to study the behavior of pre-trained language models (PLMs) in inference tasks they were not initially trained for. We therefore focus on very recent research on the inference capabilities of PLMs in selected tasks such as factual probing and common-sense reasoning. We highlight relevant achievements made by these models, as well as some of their current limitations that open opportunities for further research.
This article is categorized under:
Fundamental Concepts of Data and Knowledge > Key Design Issues in Data Mining
Technologies > Artificial Intelligence
Pages: 10
Related Papers (50 total)
  • [41] Evaluating and Inducing Personality in Pre-trained Language Models
    Jiang, Guangyuan
    Xu, Manjie
    Zhu, Song-Chun
    Han, Wenjuan
    Zhang, Chi
    Zhu, Yixin
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [42] Evaluating the Summarization Comprehension of Pre-Trained Language Models
    D. I. Chernyshev
    B. V. Dobrov
    [J]. Lobachevskii Journal of Mathematics, 2023, 44 : 3028 - 3039
  • [43] Pre-trained models for natural language processing: A survey
    XiPeng Qiu
    TianXiang Sun
    YiGe Xu
    YunFan Shao
    Ning Dai
    XuanJing Huang
    [J]. Science China Technological Sciences, 2020, 63 : 1872 - 1897
  • [44] Robust Lottery Tickets for Pre-trained Language Models
    Zheng, Rui
    Bao, Rong
    Zhou, Yuhao
    Liang, Di
    Wang, Sirui
    Wu, Wei
    Gui, Tao
    Zhang, Qi
    Huang, Xuanjing
    [J]. PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 2211 - 2224
  • [45] Pre-Trained Language Models for Text Generation: A Survey
    Li, Junyi
    Tang, Tianyi
    Zhao, Wayne Xin
    Nie, Jian-Yun
    Wen, Ji-Rong
    [J]. ACM COMPUTING SURVEYS, 2024, 56 (09)
  • [46] Leveraging pre-trained language models for code generation
    Soliman, Ahmed
    Shaheen, Samir
    Hadhoud, Mayada
    [J]. COMPLEX & INTELLIGENT SYSTEMS, 2024, 10 (03) : 3955 - 3980
  • [47] What Makes Pre-trained Language Models Better Zero-shot Learners?
    Lu, Jinghui
    Zhu, Dongsheng
    Han, Weidong
    Zhao, Rui
    Mac Namee, Brian
    Tan, Fei
    [J]. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 2288 - 2303
  • [48] Modeling Second Language Acquisition with pre-trained neural language models
    Palenzuela, Alvaro J. Jimenez
    Frasincar, Flavius
    Trusca, Maria Mihaela
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2022, 207
  • [49] μBERT: Mutation Testing using Pre-Trained Language Models
    Degiovanni, Renzo
    Papadakis, Mike
    [J]. 2022 IEEE 15TH INTERNATIONAL CONFERENCE ON SOFTWARE TESTING, VERIFICATION AND VALIDATION WORKSHOPS (ICSTW 2022), 2022, : 160 - 169
  • [50] In-Context Analogical Reasoning with Pre-Trained Language Models
    Hu, Xiaoyang
    Storks, Shane
    Lewis, Richard L.
    Chai, Joyce
    [J]. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 1953 - 1969