A Systematic Review of Transformer-Based Pre-Trained Language Models through Self-Supervised Learning

Cited: 18
Authors
Kotei, Evans [1 ]
Thirunavukarasu, Ramkumar [1 ]
Affiliations
[1] Vellore Inst Technol, Sch Informat Technol & Engn, Vellore 632014, India
Keywords
transformer network; transfer learning; pretraining; natural language processing; language models; BERT;
DOI
10.3390/info14030187
Chinese Library Classification (CLC): TP [automation technology, computer technology]
Subject Classification Code: 0812
Abstract
Transfer learning is a technique used in deep learning applications to transfer knowledge learned in a source domain to a different target domain. The approach mainly addresses the problem of small training datasets, which causes model overfitting and degrades model performance. The study was carried out on publications retrieved from digital libraries including SCOPUS, ScienceDirect, IEEE Xplore, the ACM Digital Library, and Google Scholar, which formed the primary studies. Secondary studies were retrieved from the primary articles using the backward and forward snowballing approach. Relevant publications were then selected for review according to predefined inclusion and exclusion criteria. The study focused on transfer learning with pretrained NLP models built on the deep transformer network. BERT and GPT are the two leading pretrained models, trained to capture global and local representations from large unlabeled text datasets through self-supervised learning. Pretrained transformer models offer numerous advantages to natural language processing, such as knowledge transfer to downstream tasks, which mitigates the drawbacks of training a model from scratch. This review gives a comprehensive view of the transformer architecture, of self-supervised learning and pretraining concepts in language models, and of their adaptation to downstream tasks. Finally, we present future directions for further improving pretrained transformer-based language models.
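The workflow the abstract describes, reusing a transformer pretrained on unlabeled text through self-supervised learning and adapting it to a labeled downstream task, can be illustrated with a minimal sketch. The sketch below is not taken from the reviewed paper; it assumes the Hugging Face transformers and PyTorch libraries, the bert-base-uncased checkpoint, and toy data and hyperparameters chosen purely for illustration.

# Minimal sketch (illustrative only, not from the reviewed paper): load a
# transformer pretrained with a self-supervised objective and fine-tune it
# on a toy downstream classification task.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "bert-base-uncased"  # assumed checkpoint; any pretrained model would do
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Toy labeled examples standing in for a downstream dataset.
texts = ["the movie was great", "the plot made no sense"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# Fine-tuning: a few gradient steps reuse the pretrained weights instead of
# training from scratch, which is the knowledge transfer the abstract refers to.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Inference on the downstream task after transfer.
model.eval()
with torch.no_grad():
    predictions = model(**batch).logits.argmax(dim=-1)
print(predictions.tolist())

Only the classification head is newly initialized here; the encoder weights come from self-supervised pretraining, which is why a small labeled dataset and a small learning rate can suffice for the downstream task.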
Pages: 25
Related Papers
(50 in total)
  • [31] GhostEncoder: Stealthy backdoor attacks with dynamic triggers to pre-trained encoders in self-supervised learning
    Wang, Qiannan
    Yin, Changchun
    Fang, Liming
    Liu, Zhe
    Wang, Run
    Lin, Chenhao
    [J]. COMPUTERS & SECURITY, 2024, 142
  • [32] Towards a Transformer-Based Pre-trained Model for IoT Traffic Classification
    Bazaluk, Bruna
    Hamdan, Mosab
    Ghaleb, Mustafa
    Gismalla, Mohammed S. M.
    da Silva, Flavio S. Correa
    Batista, Daniel Macedo
    [J]. PROCEEDINGS OF 2024 IEEE/IFIP NETWORK OPERATIONS AND MANAGEMENT SYMPOSIUM, NOMS 2024, 2024,
  • [33] Self-conditioning Pre-Trained Language Models
    Suau, Xavier
    Zappella, Luca
    Apostoloff, Nicholas
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [34] Fusing Pre-trained Language Models with Multimodal Prompts through Reinforcement Learning
    Yu, Youngjae
    Chung, Jiwan
    Yun, Heeseung
    Hessel, Jack
    Park, Jae Sung
    Lu, Ximing
    Zellers, Rowan
    Ammanabrolu, Prithviraj
    Le Bras, Ronan
    Kim, Gunhee
    Choi, Yejin
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 10845 - 10856
  • [35] A Brief Review of Relation Extraction Based on Pre-Trained Language Models
    Xu, Tiange
    Zhang, Fu
    [J]. FUZZY SYSTEMS AND DATA MINING VI, 2020, 331 : 775 - 789
  • [36] Pre-trained language models for keyphrase prediction: A review
    Umair, Muhammad
    Sultana, Tangina
    Lee, Young-Koo
    [J]. ICT EXPRESS, 2024, 10 (04): : 871 - 890
  • [37] Transformer-Based Self-Supervised Monocular Depth and Visual Odometry
    Zhao, Hongru
    Qiao, Xiuquan
    Ma, Yi
    Tafazolli, Rahim
    [J]. IEEE SENSORS JOURNAL, 2023, 23 (02) : 1436 - 1446
  • [38] Speech Enhancement Using Self-Supervised Pre-Trained Model and Vector Quantization
    Zhao, Xiao-Ying
    Zhu, Qiu-Shi
    Zhang, Jie
    [J]. PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 330 - 334
  • [39] A Transformer Based Approach To Detect Suicidal Ideation Using Pre-Trained Language Models
    Haque, Farsheed
    Nur, Ragib Un
    Al Jahan, Shaeekh
    Mahmud, Zarar
    Shah, Faisal Muhammad
    [J]. 2020 23RD INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY (ICCIT 2020), 2020,
  • [40] CheSS: Chest X-Ray Pre-trained Model via Self-supervised Contrastive Learning
    Cho, Kyungjin
    Kim, Ki Duk
    Nam, Yujin
    Jeong, Jiheon
    Kim, Jeeyoung
    Choi, Changyong
    Lee, Soyoung
    Lee, Jun Soo
    Woo, Seoyeon
    Hong, Gil-Sun
    Seo, Joon Beom
    Kim, Namkug
    [J]. Journal of Digital Imaging, 2023, 36 : 902 - 910