Pre-trained language models: What do they know?

Cited by: 0
Authors
Guimaraes, Nuno [1 ,2 ]
Campos, Ricardo [1 ,3 ,4 ]
Jorge, Alipio [1 ,2 ]
Affiliations
[1] LIAAD INESCTEC, Porto, Portugal
[2] Univ Porto, Porto, Portugal
[3] Univ Beira Interior, Covilha, Portugal
[4] Polytech Inst Tomar, Ci2 Smart Cities Res Ctr, Tomar, Portugal
Keywords
large language models; natural language processing; pretrained language models
DOI
10.1002/widm.1518
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Large language models (LLMs) have substantially pushed artificial intelligence (AI) research and applications forward in the last few years. They currently achieve high effectiveness in a range of natural language processing (NLP) tasks, such as machine translation, named entity recognition, text classification, question answering, and text summarization. Recently, significant attention has been drawn to the capabilities of OpenAI's GPT models and their highly accessible interface. LLMs are now routinely used and studied for downstream tasks and specific applications with great success, pushing forward the state of the art in almost all of them. However, they also exhibit impressive inference capabilities when used off the shelf, without further training. In this paper, we study the behavior of pre-trained language models (PLMs) on inference tasks they were not initially trained for. We therefore focus on very recent research on the inference capabilities of PLMs in selected tasks such as factual probing and common-sense reasoning. We highlight relevant achievements made by these models, as well as some of their current limitations, which open opportunities for further research. This article is categorized under: Fundamental Concepts of Data and Knowledge > Key Design Issues in Data Mining; Technologies > Artificial Intelligence.
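To make the notion of off-the-shelf factual probing concrete, the short Python sketch below queries a pre-trained masked language model with a cloze-style prompt and prints its top-ranked fillers. This is a minimal illustration, not the method of the article: the transformers library, the bert-base-uncased checkpoint, and the example prompt are assumptions chosen only for the sketch.

```python
# Minimal factual-probing sketch (assumptions: HuggingFace transformers,
# the bert-base-uncased checkpoint, and an illustrative cloze prompt).
from transformers import pipeline

# The fill-mask pipeline exposes the masked-language-modelling head of the
# pre-trained model; no fine-tuning is involved.
probe = pipeline("fill-mask", model="bert-base-uncased")

# Cloze-style prompt: the [MASK] token marks the factual slot being probed.
prompt = "The capital of Portugal is [MASK]."

# Rank the model's candidate fillers; a high-probability correct answer
# suggests the fact is encoded in the model's parameters.
for prediction in probe(prompt, top_k=5):
    print(f"{prediction['token_str']:>12}  {prediction['score']:.3f}")
```

Swapping the template (e.g., "A hammer is used to [MASK].") turns the same pattern into a rough common-sense probe, the second family of tasks mentioned in the abstract.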
Pages: 10
Related papers (50 in total)
  • [21] Yao, Yuan; Dong, Bowen; Zhang, Ao; Zhang, Zhengyan; Xie, Ruobing; Liu, Zhiyuan; Lin, Leyu; Sun, Maosong; Wang, Jianyong. Prompt Tuning for Discriminative Pre-trained Language Models. Findings of the Association for Computational Linguistics (ACL 2022), 2022: 3468-3473.
  • [22] Westhelle, Matheus; Bencke, Luciana; Moreira, Viviane P. Impact of Morphological Segmentation on Pre-trained Language Models. Intelligent Systems, Pt II, 2022, 13654: 402-416.
  • [23] Hu, Linmei; Liu, Zeyi; Zhao, Ziwang; Hou, Lei; Nie, Liqiang; Li, Juanzi. A Survey of Knowledge Enhanced Pre-Trained Language Models. IEEE Transactions on Knowledge and Data Engineering, 2024, 36(4): 1413-1430.
  • [24] Suau, Xavier; Zappella, Luca; Apostoloff, Nicholas. Self-conditioning Pre-Trained Language Models. International Conference on Machine Learning, Vol 162, 2022.
  • [25] Kang, Cheng; Prokop, Jindrich; Tong, Lei; Zhou, Huiyu; Hu, Yong; Novak, Daniel. InA: Inhibition Adaption on Pre-trained Language Models. Neural Networks, 2024, 178.
  • [26] Jain, Nishtha; Popovic, Maja; Groves, Declan; Specia, Lucia. Leveraging Pre-trained Language Models for Gender Debiasing. LREC 2022: Thirteenth International Conference on Language Resources and Evaluation, 2022: 2188-2195.
  • [27] Li, Bohan; Zhou, Hao; He, Junxian; Wang, Mingxuan; Yang, Yiming; Li, Lei. On the Sentence Embeddings from Pre-trained Language Models. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020: 9119-9130.
  • [28] Ben Noach, Matan; Goldberg, Yoav. Compressing Pre-trained Language Models by Matrix Decomposition. 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (AACL-IJCNLP 2020), 2020: 884-889.
  • [29] Umair, Muhammad; Sultana, Tangina; Lee, Young-Koo. Pre-trained Language Models for Keyphrase Prediction: A Review. ICT Express, 2024, 10(4): 871-890.
  • [30] Li, Yuliang; Li, Jinfeng; Suhara, Yoshihiko; Doan, AnHai; Tan, Wang-Chiew. Deep Entity Matching with Pre-Trained Language Models. Proceedings of the VLDB Endowment, 2020, 14(1): 50-60.