Pre-training and Evaluating Transformer-based Language Models for Icelandic

Cited by: 0
Authors
Daðason, Jón Friðrik [1]
Loftsson, Hrafn [1 ]
Affiliations
[1] Reykjavik Univ, Dept Comp Sci, Reykjavik, Iceland
Keywords
Language Models; Transformer; Evaluation; Icelandic
DOI
Not available
Chinese Library Classification
TP39 [Computer Applications]
Discipline Classification Codes
081203; 0835
Abstract
In this paper, we evaluate several Transformer-based language models for Icelandic on four downstream tasks: Part-of-Speech tagging, Named Entity Recognition, Dependency Parsing, and Automatic Text Summarization. We pre-train four types of monolingual ELECTRA and ConvBERT models and compare our results to a previously trained monolingual RoBERTa model and the multilingual mBERT model. We find that the Transformer models obtain better results, often by a large margin, than previous state-of-the-art models. Furthermore, our results indicate that pre-training larger language models yields a significant reduction in error rates compared to smaller models. Finally, our results show that the monolingual models for Icelandic outperform a comparably sized multilingual model.
Pages: 7386-7391
Page count: 6
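
The evaluation setup described in the abstract, fine-tuning pre-trained encoders on token-level tasks such as Part-of-Speech tagging, can be sketched as follows. This is a minimal illustration only, not the authors' code: the checkpoint path and tag set are placeholders, and it assumes the Hugging Face transformers library with PyTorch installed.

# Sketch (assumed setup, not from the paper): loading a pre-trained
# Icelandic encoder with a token-classification head, as one would for
# PoS tagging. The head is randomly initialized here and would need
# fine-tuning on labeled data before its predictions are meaningful.
from transformers import AutoTokenizer, AutoModelForTokenClassification

MODEL_ID = "path/to/icelandic-electra"  # placeholder checkpoint, not a released artifact
LABELS = ["NOUN", "VERB", "ADJ", "ADV", "PRON", "ADP", "PUNCT"]  # toy tag set

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForTokenClassification.from_pretrained(
    MODEL_ID,
    num_labels=len(LABELS),
    id2label=dict(enumerate(LABELS)),
    label2id={label: i for i, label in enumerate(LABELS)},
)

# Tokenize a pre-split sentence; is_split_into_words=True preserves word
# boundaries so subword predictions can be mapped back to whole words.
words = ["Þetta", "er", "íslensk", "setning", "."]
enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
logits = model(**enc).logits              # shape: (1, seq_len, num_labels)
pred_ids = logits.argmax(-1)[0].tolist()

# Keep one prediction per word: the tag of each word's first subword piece.
seen = set()
for idx, wid in enumerate(enc.word_ids()):
    if wid is not None and wid not in seen:
        seen.add(wid)
        print(words[wid], LABELS[pred_ids[idx]])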