End-to-End Transformer-Based Models in Textual-Based NLP

Cited by: 17
Authors
Rahali, Abir [1 ]
Akhloufi, Moulay A. [1 ]
Affiliations
[1] Univ Moncton, Dept Comp Sci, Percept Robot & Intelligent Machines Res Grp PRIME, Moncton, NB E1A 3E9, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Transformers; deep learning; natural language processing; transfer learning; PRE-TRAINED BERT; PREDICTION; SYSTEMS;
DOI
10.3390/ai4010004
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Transformer architectures are highly expressive because they use self-attention mechanisms to encode long-range dependencies in the input sequences. In this paper, we present a literature review on Transformer-based (TB) models, providing a detailed overview of each model in comparison to the Transformer's standard architecture. This survey focuses on TB models used in the field of Natural Language Processing (NLP) for text-based tasks. We begin with an overview of the fundamental concepts at the heart of the success of these models. Then, we classify them based on their architecture and training mode. We compare the advantages and disadvantages of popular techniques in terms of architectural design and experimental value. Finally, we discuss open research directions and potential future work to help solve current TB application challenges in NLP.
Pages: 54-110
Number of pages: 57
Related Papers
50 records total
  • [41] CarcassFormer: an end-to-end transformer-based framework for simultaneous localization, segmentation and classification of poultry carcass defect
    Tran, Minh
    Truong, Sang
    Fernandes, Arthur F. A.
    Kidd, Michael T.
    Le, Ngan
    [J]. POULTRY SCIENCE, 2024, 103 (08)
  • [42] End to end transformer-based contextual speech recognition based on pointer network
    Lin, Binghuai
    Wang, Liyuan
    [J]. INTERSPEECH 2021, 2021, : 2087 - 2091
  • [43] Transformer-Based 3D Face Reconstruction With End-to-End Shape-Preserved Domain Transfer
    Chen, Zhuo
    Wang, Yuesong
    Guan, Tao
    Xu, Luoyuan
    Liu, Wenkai
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (12) : 8383 - 8393
  • [45] E2EET: from pipeline to end-to-end entity typing via transformer-based embeddings
    Stewart, Michael
    Liu, Wei
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2022, 64 (01) : 95 - 113
  • [46] HIERARCHICAL TRANSFORMER-BASED LARGE-CONTEXT END-TO-END ASR WITH LARGE-CONTEXT KNOWLEDGE DISTILLATION
    Masumura, Ryo
    Makishima, Naoki
    Ihori, Mana
    Takashima, Akihiko
    Tanaka, Tomohiro
    Orihashi, Shota
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 5879 - 5883
  • [47] End-to-End Transformer-Based Open-Vocabulary Keyword Spotting with Location-Guided Local Attention
    Wei, Bo
    Yang, Meirong
    Zhang, Tao
    Tang, Xiao
    Huang, Xing
    Kim, Kyuhong
    Lee, Jaeyun
    Cho, Kiho
    Park, Sung-Un
    [J]. INTERSPEECH 2021, 2021, : 361 - 365
  • [49] TRMER: Transformer-Based End to End Printed Mathematical Expression Recognition
    Zhou, Zhaokun
    Ji, Shuaijian
    Wang, Yuqing
    Weng, Zhenyu
    Zhu, Yuesheng
    [J]. 2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [50] Semantic Mask for Transformer based End-to-End Speech Recognition
    Wang, Chengyi
    Wu, Yu
    Du, Yujiao
    Li, Jinyu
    Liu, Shujie
    Lu, Liang
    Ren, Shuo
    Ye, Guoli
    Zhao, Sheng
    Zhou, Ming
    [J]. INTERSPEECH 2020, 2020, : 971 - 975