End-to-End Transformer-Based Models in Textual-Based NLP

Cited by: 27
Authors
Rahali, Abir [1 ]
Akhloufi, Moulay A. [1 ]
Affiliations
[1] Univ Moncton, Dept Comp Sci, Percept Robot & Intelligent Machines Res Grp PRIME, Moncton, NB E1A 3E9, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Transformers; deep learning; natural language processing; transfer learning; PRE-TRAINED BERT; PREDICTION; SYSTEMS;
DOI
10.3390/ai4010004
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Transformer architectures are highly expressive because they use self-attention mechanisms to encode long-range dependencies in the input sequences. In this paper, we present a literature review on Transformer-based (TB) models, providing a detailed overview of each model in comparison to the Transformer's standard architecture. This survey focuses on TB models used in the field of Natural Language Processing (NLP) for textual-based tasks. We begin with an overview of the fundamental concepts at the heart of the success of these models. Then, we classify them based on their architecture and training mode. We compare the advantages and disadvantages of popular techniques in terms of architectural design and experimental value. Finally, we discuss open research directions and potential future work to help solve current TB application challenges in NLP.
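For readers unfamiliar with the mechanism the abstract refers to, the sketch below illustrates scaled dot-product self-attention, the core operation of the standard Transformer architecture the survey compares against. This is a minimal single-head NumPy illustration, not code from the surveyed paper; all names and dimensions are illustrative assumptions.

```python
import numpy as np

def scaled_dot_product_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over a sequence of token embeddings.

    X          : (seq_len, d_model) input embeddings
    Wq, Wk, Wv : (d_model, d_k) learned projection matrices
    Returns    : (seq_len, d_k) context vectors
    """
    Q = X @ Wq                       # queries
    K = X @ Wk                       # keys
    V = X @ Wv                       # values
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (seq_len, seq_len) pairwise scores
    # Softmax over the key axis: every token attends to every other token
    # in a single step, which is how long-range dependencies are encoded.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Toy usage: 5 tokens, model width 8, head width 4 (illustrative sizes)
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
context = scaled_dot_product_attention(X, Wq, Wk, Wv)
print(context.shape)  # (5, 4)
```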
Pages: 54-110
Number of pages: 57
Related Papers
50 records in total
  • [21] Qin, Yiming; Li, Jiajia; Chen, Yulong; Wang, Zikai; Huang, Yu-An; You, Zhuhong; Hu, Lun; Hu, Pengwei; Tan, Feng. TransOrga: End-To-End Multi-modal Transformer-Based Organoid Segmentation. ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, ICIC 2023, PT III, 2023, 14088: 460-472.
  • [22] Sirisha, Museboyina; Sudha, S. V. TOD-Net: An end-to-end transformer-based object detection network. COMPUTERS & ELECTRICAL ENGINEERING, 2023, 108.
  • [23] Neerudu, Pavan Kalyan Reddy; Oota, Subba Reddy; Marreddy, Mounika; Kagita, Venkateswara Rao; Gupta, Manish. On Robustness of Finetuned Transformer-based NLP Models. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023: 7180-7195.
  • [24] Maekaku, Takashi; Fujita, Yuya; Peng, Yifan; Watanabe, Shinji. Attention Weight Smoothing Using Prior Distributions for Transformer-Based End-to-End ASR. INTERSPEECH 2022, 2022: 1071-1075.
  • [25] Zhao, Jiaqi; Ding, Zeyu; Zhou, Yong; Zhu, Hancheng; Du, Wen-Liang; Yao, Rui; El Saddik, Abdulmotaleb. OrientedFormer: An End-to-End Transformer-Based Oriented Object Detector in Remote Sensing Images. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62.
  • [26] Xie, Jiaxing; Hua, Jiajun; Chen, Shaonan; Wu, Peiwen; Gao, Peng; Sun, Daozong; Lyu, Zhendong; Lyu, Shilei; Xue, Xiuyun; Lu, Jianqiang. HyperSFormer: A Transformer-Based End-to-End Hyperspectral Image Classification Method for Crop Classification. REMOTE SENSING, 2023, 15 (14).
  • [27] Weng, Shi-Yan; Chiu, Hsuan-Sheng; Chen, Berlin. An Empirical Study on Transformer-Based End-to-End Speech Recognition with Novel Decoder Masking. 2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2021: 518-522.
  • [28] Song, Kang; Wang, Kai; Wang, Shibo; Wang, Nan; Zhang, Jingxin; Zhang, Kanjian; Wei, Haikun. Intra-hour solar irradiance forecasting: An end-to-end Transformer-based network. 39TH YOUTH ACADEMIC ANNUAL CONFERENCE OF CHINESE ASSOCIATION OF AUTOMATION, YAC 2024, 2024: 526-531.
  • [29] Yang, Da-Hee; Chang, Joon-Hyuk. FiLM Conditioning with Enhanced Feature to the Transformer-based End-to-End Noisy Speech Recognition. INTERSPEECH 2022, 2022: 4098-4102.