Pre-trained language models: What do they know?

Cited by: 0
Authors
Guimaraes, Nuno [1 ,2 ]
Campos, Ricardo [1 ,3 ,4 ]
Jorge, Alipio [1 ,2 ]
Affiliations
[1] LIAAD INESCTEC, Porto, Portugal
[2] Univ Porto, Porto, Portugal
[3] Univ Beira Interior, Covilha, Portugal
[4] Polytech Inst Tomar, Ci2 Smart Cities Res Ctr, Tomar, Portugal
Keywords
large language models; natural language processing; pretrained language models
DOI
10.1002/widm.1518
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Large language models (LLMs) have substantially pushed artificial intelligence (AI) research and applications forward in the last few years. They currently achieve high effectiveness in a range of natural language processing (NLP) tasks, such as machine translation, named entity recognition, text classification, question answering, and text summarization. Recently, significant attention has been drawn to the capabilities of OpenAI's GPT models and their highly accessible interface. LLMs are now routinely used and studied for downstream tasks and specific applications with great success, pushing forward the state of the art in almost all of them. However, they also exhibit impressive inference capabilities when used off the shelf, without further training. In this paper, we study the behavior of pre-trained language models (PLMs) on inference tasks they were not initially trained for. We therefore focus on very recent research on the inference capabilities of PLMs in selected tasks such as factual probing and common-sense reasoning. We highlight relevant achievements made by these models, as well as some of their current limitations, which open opportunities for further research. This article is categorized under: Fundamental Concepts of Data and Knowledge > Key Design Issues in Data Mining; Technologies > Artificial Intelligence.
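To make the notion of off-the-shelf factual probing concrete, the short Python sketch below queries a pre-trained masked language model with a cloze-style prompt and prints its top-ranked fillers. This is a minimal illustration, not the method of the article: the transformers library, the bert-base-uncased checkpoint, and the example prompt are assumptions chosen only for the sketch.

```python
# Minimal factual-probing sketch (assumptions: HuggingFace transformers,
# the bert-base-uncased checkpoint, and an illustrative cloze prompt).
from transformers import pipeline

# The fill-mask pipeline exposes the masked-language-modelling head of the
# pre-trained model; no fine-tuning is involved.
probe = pipeline("fill-mask", model="bert-base-uncased")

# Cloze-style prompt: the [MASK] token marks the factual slot being probed.
prompt = "The capital of Portugal is [MASK]."

# Rank the model's candidate fillers; a high-probability correct answer
# suggests the fact is encoded in the model's parameters.
for prediction in probe(prompt, top_k=5):
    print(f"{prediction['token_str']:>12}  {prediction['score']:.3f}")
```

Swapping the template (e.g., "A hammer is used to [MASK].") turns the same pattern into a rough common-sense probe, the second family of tasks mentioned in the abstract.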
Pages: 10
Related papers (50 in total)
  • [21] Yao, Yuan; Dong, Bowen; Zhang, Ao; Zhang, Zhengyan; Xie, Ruobing; Liu, Zhiyuan; Lin, Leyu; Sun, Maosong; Wang, Jianyong. Prompt Tuning for Discriminative Pre-trained Language Models. Findings of the Association for Computational Linguistics (ACL 2022), 2022: 3468-3473.
  • [22] Westhelle, Matheus; Bencke, Luciana; Moreira, Viviane P. Impact of Morphological Segmentation on Pre-trained Language Models. Intelligent Systems, Pt II, 2022, 13654: 402-416.
  • [23] Hu, Linmei; Liu, Zeyi; Zhao, Ziwang; Hou, Lei; Nie, Liqiang; Li, Juanzi. A Survey of Knowledge Enhanced Pre-Trained Language Models. IEEE Transactions on Knowledge and Data Engineering, 2024, 36(4): 1413-1430.
  • [24] Suau, Xavier; Zappella, Luca; Apostoloff, Nicholas. Self-conditioning Pre-Trained Language Models. International Conference on Machine Learning, Vol 162, 2022.
  • [25] Kang, Cheng; Prokop, Jindrich; Tong, Lei; Zhou, Huiyu; Hu, Yong; Novak, Daniel. InA: Inhibition Adaption on Pre-trained Language Models. Neural Networks, 2024, 178.
  • [26] Jain, Nishtha; Popovic, Maja; Groves, Declan; Specia, Lucia. Leveraging Pre-trained Language Models for Gender Debiasing. LREC 2022: Thirteenth International Conference on Language Resources and Evaluation, 2022: 2188-2195.
  • [27] Li, Bohan; Zhou, Hao; He, Junxian; Wang, Mingxuan; Yang, Yiming; Li, Lei. On the Sentence Embeddings from Pre-trained Language Models. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020: 9119-9130.
  • [28] Ben Noach, Matan; Goldberg, Yoav. Compressing Pre-trained Language Models by Matrix Decomposition. 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (AACL-IJCNLP 2020), 2020: 884-889.
  • [29] Umair, Muhammad; Sultana, Tangina; Lee, Young-Koo. Pre-trained Language Models for Keyphrase Prediction: A Review. ICT Express, 2024, 10(4): 871-890.
  • [30] Li, Yuliang; Li, Jinfeng; Suhara, Yoshihiko; Doan, AnHai; Tan, Wang-Chiew. Deep Entity Matching with Pre-Trained Language Models. Proceedings of the VLDB Endowment, 2020, 14(1): 50-60.