End-to-End Transformer-Based Models in Textual-Based NLP

Cited by: 27
Authors
Rahali, Abir [1 ]
Akhloufi, Moulay A. [1 ]
Affiliations
[1] Univ Moncton, Dept Comp Sci, Percept Robot & Intelligent Machines Res Grp PRIME, Moncton, NB E1A 3E9, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Transformers; deep learning; natural language processing; transfer learning; PRE-TRAINED BERT; PREDICTION; SYSTEMS;
DOI
10.3390/ai4010004
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Transformer architectures are highly expressive because they use self-attention mechanisms to encode long-range dependencies in the input sequences. In this paper, we present a literature review on Transformer-based (TB) models, providing a detailed overview of each model in comparison to the Transformer's standard architecture. This survey focuses on TB models used in the field of Natural Language Processing (NLP) for textual-based tasks. We begin with an overview of the fundamental concepts at the heart of the success of these models. Then, we classify them based on their architecture and training mode. We compare the advantages and disadvantages of popular techniques in terms of architectural design and experimental value. Finally, we discuss open research directions and potential future work to help solve current TB application challenges in NLP.
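For readers unfamiliar with the mechanism the abstract refers to, the sketch below illustrates scaled dot-product self-attention, the core operation of the standard Transformer architecture the survey compares against. This is a minimal single-head NumPy illustration, not code from the surveyed paper; all names and dimensions are illustrative assumptions.

```python
import numpy as np

def scaled_dot_product_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over a sequence of token embeddings.

    X          : (seq_len, d_model) input embeddings
    Wq, Wk, Wv : (d_model, d_k) learned projection matrices
    Returns    : (seq_len, d_k) context vectors
    """
    Q = X @ Wq                       # queries
    K = X @ Wk                       # keys
    V = X @ Wv                       # values
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (seq_len, seq_len) pairwise scores
    # Softmax over the key axis: every token attends to every other token
    # in a single step, which is how long-range dependencies are encoded.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Toy usage: 5 tokens, model width 8, head width 4 (illustrative sizes)
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
context = scaled_dot_product_attention(X, Wq, Wk, Wv)
print(context.shape)  # (5, 4)
```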
Pages: 54-110
Number of pages: 57
Related Papers
50 records in total
  • [21] Qin, Yiming; Li, Jiajia; Chen, Yulong; Wang, Zikai; Huang, Yu-An; You, Zhuhong; Hu, Lun; Hu, Pengwei; Tan, Feng. TransOrga: End-To-End Multi-modal Transformer-Based Organoid Segmentation. ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, ICIC 2023, PT III, 2023, 14088: 460-472.
  • [22] Sirisha, Museboyina; Sudha, S. V. TOD-Net: An end-to-end transformer-based object detection network. COMPUTERS & ELECTRICAL ENGINEERING, 2023, 108.
  • [23] Neerudu, Pavan Kalyan Reddy; Oota, Subba Reddy; Marreddy, Mounika; Kagita, Venkateswara Rao; Gupta, Manish. On Robustness of Finetuned Transformer-based NLP Models. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023: 7180-7195.
  • [24] Maekaku, Takashi; Fujita, Yuya; Peng, Yifan; Watanabe, Shinji. Attention Weight Smoothing Using Prior Distributions for Transformer-Based End-to-End ASR. INTERSPEECH 2022, 2022: 1071-1075.
  • [25] Zhao, Jiaqi; Ding, Zeyu; Zhou, Yong; Zhu, Hancheng; Du, Wen-Liang; Yao, Rui; El Saddik, Abdulmotaleb. OrientedFormer: An End-to-End Transformer-Based Oriented Object Detector in Remote Sensing Images. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62.
  • [26] Xie, Jiaxing; Hua, Jiajun; Chen, Shaonan; Wu, Peiwen; Gao, Peng; Sun, Daozong; Lyu, Zhendong; Lyu, Shilei; Xue, Xiuyun; Lu, Jianqiang. HyperSFormer: A Transformer-Based End-to-End Hyperspectral Image Classification Method for Crop Classification. REMOTE SENSING, 2023, 15 (14).
  • [27] Weng, Shi-Yan; Chiu, Hsuan-Sheng; Chen, Berlin. An Empirical Study on Transformer-Based End-to-End Speech Recognition with Novel Decoder Masking. 2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2021: 518-522.
  • [28] Song, Kang; Wang, Kai; Wang, Shibo; Wang, Nan; Zhang, Jingxin; Zhang, Kanjian; Wei, Haikun. Intra-hour solar irradiance forecasting: An end-to-end Transformer-based network. 39TH YOUTH ACADEMIC ANNUAL CONFERENCE OF CHINESE ASSOCIATION OF AUTOMATION, YAC 2024, 2024: 526-531.
  • [29] Yang, Da-Hee; Chang, Joon-Hyuk. FiLM Conditioning with Enhanced Feature to the Transformer-based End-to-End Noisy Speech Recognition. INTERSPEECH 2022, 2022: 4098-4102.