A Systematic Review of Transformer-Based Pre-Trained Language Models through Self-Supervised Learning

Cited: 18
Authors
Kotei, Evans [1 ]
Thirunavukarasu, Ramkumar [1 ]
Affiliations
[1] Vellore Inst Technol, Sch Informat Technol & Engn, Vellore 632014, India
Keywords
transformer network; transfer learning; pretraining; natural language processing; language models; BERT;
DOI
10.3390/info14030187
Chinese Library Classification (CLC): TP [automation technology, computer technology]
Subject Classification Code: 0812
Abstract
Transfer learning is a technique used in deep learning applications to transfer knowledge learned in a source domain to a different target domain. The approach mainly addresses the problem of small training datasets, which causes model overfitting and degrades model performance. The study was carried out on publications retrieved from digital libraries including SCOPUS, ScienceDirect, IEEE Xplore, the ACM Digital Library, and Google Scholar, which formed the primary studies. Secondary studies were retrieved from the primary articles using the backward and forward snowballing approach. Relevant publications were then selected for review according to predefined inclusion and exclusion criteria. The study focused on transfer learning with pretrained NLP models built on the deep transformer network. BERT and GPT are the two leading pretrained models, trained to capture global and local representations from large unlabeled text datasets through self-supervised learning. Pretrained transformer models offer numerous advantages to natural language processing, such as knowledge transfer to downstream tasks, which mitigates the drawbacks of training a model from scratch. This review gives a comprehensive view of the transformer architecture, of self-supervised learning and pretraining concepts in language models, and of their adaptation to downstream tasks. Finally, we present future directions for further improving pretrained transformer-based language models.
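The workflow the abstract describes, reusing a transformer pretrained on unlabeled text through self-supervised learning and adapting it to a labeled downstream task, can be illustrated with a minimal sketch. The sketch below is not taken from the reviewed paper; it assumes the Hugging Face transformers and PyTorch libraries, the bert-base-uncased checkpoint, and toy data and hyperparameters chosen purely for illustration.

# Minimal sketch (illustrative only, not from the reviewed paper): load a
# transformer pretrained with a self-supervised objective and fine-tune it
# on a toy downstream classification task.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "bert-base-uncased"  # assumed checkpoint; any pretrained model would do
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Toy labeled examples standing in for a downstream dataset.
texts = ["the movie was great", "the plot made no sense"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# Fine-tuning: a few gradient steps reuse the pretrained weights instead of
# training from scratch, which is the knowledge transfer the abstract refers to.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Inference on the downstream task after transfer.
model.eval()
with torch.no_grad():
    predictions = model(**batch).logits.argmax(dim=-1)
print(predictions.tolist())

Only the classification head is newly initialized here; the encoder weights come from self-supervised pretraining, which is why a small labeled dataset and a small learning rate can suffice for the downstream task.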
Pages: 25
Related Papers
(50 in total)
  • [31] GhostEncoder: Stealthy backdoor attacks with dynamic triggers to pre-trained encoders in self-supervised learning
    Wang, Qiannan
    Yin, Changchun
    Fang, Liming
    Liu, Zhe
    Wang, Run
    Lin, Chenhao
    [J]. COMPUTERS & SECURITY, 2024, 142
  • [32] Towards a Transformer-Based Pre-trained Model for IoT Traffic Classification
    Bazaluk, Bruna
    Hamdan, Mosab
    Ghaleb, Mustafa
    Gismalla, Mohammed S. M.
    da Silva, Flavio S. Correa
    Batista, Daniel Macedo
    [J]. PROCEEDINGS OF 2024 IEEE/IFIP NETWORK OPERATIONS AND MANAGEMENT SYMPOSIUM, NOMS 2024, 2024,
  • [33] Self-conditioning Pre-Trained Language Models
    Suau, Xavier
    Zappella, Luca
    Apostoloff, Nicholas
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [34] Fusing Pre-trained Language Models with Multimodal Prompts through Reinforcement Learning
    Yu, Youngjae
    Chung, Jiwan
    Yun, Heeseung
    Hessel, Jack
    Park, Jae Sung
    Lu, Ximing
    Zellers, Rowan
    Ammanabrolu, Prithviraj
    Le Bras, Ronan
    Kim, Gunhee
    Choi, Yejin
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 10845 - 10856
  • [35] A Brief Review of Relation Extraction Based on Pre-Trained Language Models
    Xu, Tiange
    Zhang, Fu
    [J]. FUZZY SYSTEMS AND DATA MINING VI, 2020, 331 : 775 - 789
  • [36] Pre-trained language models for keyphrase prediction: A review
    Umair, Muhammad
    Sultana, Tangina
    Lee, Young-Koo
    [J]. ICT EXPRESS, 2024, 10 (04): : 871 - 890
  • [37] Transformer-Based Self-Supervised Monocular Depth and Visual Odometry
    Zhao, Hongru
    Qiao, Xiuquan
    Ma, Yi
    Tafazolli, Rahim
    [J]. IEEE SENSORS JOURNAL, 2023, 23 (02) : 1436 - 1446
  • [38] Speech Enhancement Using Self-Supervised Pre-Trained Model and Vector Quantization
    Zhao, Xiao-Ying
    Zhu, Qiu-Shi
    Zhang, Jie
    [J]. PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 330 - 334
  • [39] A Transformer Based Approach To Detect Suicidal Ideation Using Pre-Trained Language Models
    Haque, Farsheed
    Nur, Ragib Un
    Al Jahan, Shaeekh
    Mahmud, Zarar
    Shah, Faisal Muhammad
    [J]. 2020 23RD INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY (ICCIT 2020), 2020,
  • [40] CheSS: Chest X-Ray Pre-trained Model via Self-supervised Contrastive Learning
    Cho, Kyungjin
    Kim, Ki Duk
    Nam, Yujin
    Jeong, Jiheon
    Kim, Jeeyoung
    Choi, Changyong
    Lee, Soyoung
    Lee, Jun Soo
    Woo, Seoyeon
    Hong, Gil-Sun
    Seo, Joon Beom
    Kim, Namkug
    [J]. Journal of Digital Imaging, 2023, 36 : 902 - 910