A Systematic Review of Transformer-Based Pre-Trained Language Models through Self-Supervised Learning

Cited by: 18
Authors: Kotei, Evans [1]; Thirunavukarasu, Ramkumar [1]
Affiliations: [1] Vellore Inst Technol, Sch Informat Technol & Engn, Vellore 632014, India
Keywords: transformer network; transfer learning; pretraining; natural language processing; language models; BERT
DOI: 10.3390/info14030187
Chinese Library Classification (CLC): TP [Automation Technology, Computer Technology]
Discipline Classification Code: 0812
Abstract:
Transfer learning is a technique used in deep learning to transfer knowledge learned in one domain to a different target domain. It primarily addresses the problem of limited training data, which causes model overfitting and degrades performance. The study was carried out on publications retrieved from digital libraries such as SCOPUS, ScienceDirect, IEEE Xplore, the ACM Digital Library, and Google Scholar, which formed the primary studies. Secondary studies were retrieved from the primary articles using the backward and forward snowballing approach. Relevant publications were selected for review based on predefined inclusion and exclusion criteria. The study focused on transfer learning with pretrained NLP models built on the deep transformer network. BERT and GPT are the two leading pretrained models, trained to capture global and local representations from large unlabeled text datasets through self-supervised learning. Pretrained transformer models offer numerous advantages to natural language processing, such as transferring knowledge to downstream tasks, which mitigates the drawbacks of training a model from scratch. This review gives a comprehensive view of the transformer architecture, self-supervised learning and pretraining concepts in language models, and their adaptation to downstream tasks. Finally, we present future directions for further improving pretrained transformer-based language models.
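The pretraining-then-fine-tuning workflow the abstract describes can be made concrete with a short sketch. The following is a minimal, illustrative example, not code from the reviewed paper; it assumes the Hugging Face transformers and PyTorch packages, the bert-base-uncased checkpoint, and a toy two-example sentiment dataset. It loads weights pretrained through self-supervised masked language modeling and fine-tunes them, together with a freshly initialized classification head, on a small labeled downstream task.

    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    # Load weights pretrained on large unlabeled text via self-supervised
    # masked language modeling; a new classification head is initialized on top.
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2
    )

    # Toy labeled downstream data (hypothetical sentiment examples).
    texts = ["a gripping, well-paced read", "dull and repetitive throughout"]
    labels = torch.tensor([1, 0])
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

    # Fine-tune all weights with a small learning rate so the pretrained
    # knowledge is adapted rather than overwritten.
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
    model.train()
    for _ in range(3):  # a few gradient steps on the tiny batch
        outputs = model(**batch, labels=labels)
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()

Because the encoder arrives with general language knowledge, even a handful of labeled examples moves the classifier in a useful direction, which is the overfitting remedy the abstract attributes to transfer learning.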
Pages: 25
Related Papers (50 in total)
  • [41] Transformer-Based Self-Supervised Monocular Depth and Visual Odometry
    Zhao, Hongru; Qiao, Xiuquan; Ma, Yi; Tafazolli, Rahim
    IEEE SENSORS JOURNAL, 2023, 23 (02): 1436-1446
  • [43] CheSS: Chest X-Ray Pre-trained Model via Self-supervised Contrastive Learning
    Cho, Kyungjin; Kim, Ki Duk; Nam, Yujin; Jeong, Jiheon; Kim, Jeeyoung; Choi, Changyong; Lee, Soyoung; Lee, Jun Soo; Woo, Seoyeon; Hong, Gil-Sun; Seo, Joon Beom; Kim, Namkug
    JOURNAL OF DIGITAL IMAGING, 2023, 36 (03): 902-910
  • [44] Adapting Pre-Trained Self-Supervised Learning Model for Speech Recognition with Light-Weight Adapters
    Yue, Xianghu; Gao, Xiaoxue; Qian, Xinyuan; Li, Haizhou
    ELECTRONICS, 2024, 13 (01)
  • [45] Pre-trained Language Models in Biomedical Domain: A Systematic Survey
    Wang, Benyou; Xie, Qianqian; Pei, Jiahuan; Chen, Zhihong; Tiwari, Prayag; Li, Zhao; Fu, Jie
    ACM COMPUTING SURVEYS, 2024, 56 (03)
  • [46] A Robust Approach to Fine-tune Pre-trained Transformer-based Models for Text Summarization through Latent Space Compression
    Falaki, Ala Alam; Gras, Robin
    2022 21ST IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2022: 160-167
  • [47] Meta Distant Transfer Learning for Pre-trained Language Models
    Wang, Chengyu; Pan, Haojie; Qiu, Minghui; Yang, Fei; Huang, Jun; Zhang, Yin
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021: 9742-9752
  • [48] SSCLNet: A Self-Supervised Contrastive Loss-Based Pre-Trained Network for Brain MRI Classification
    Mishra, Animesh; Jha, Ritesh; Bhattacharjee, Vandana
    IEEE ACCESS, 2023, 11: 6673-6681
  • [49] Explore the Use of Self-supervised Pre-trained Acoustic Features on Disguised Speech Detection
    Quan, Jie; Yang, Yingchun
    BIOMETRIC RECOGNITION (CCBR 2021), 2021, 12878: 483-490
  • [50] Mitigating Backdoor Attacks in Pre-Trained Encoders via Self-Supervised Knowledge Distillation
    Bie, Rongfang; Jiang, Jinxiu; Xie, Hongcheng; Guo, Yu; Miao, Yinbin; Jia, Xiaohua
    IEEE TRANSACTIONS ON SERVICES COMPUTING, 2024, 17 (05): 2613-2625