IgboBERT Models: Building and Training Transformer Models for the Igbo Language

Cited by: 0
Authors
Chukwuneke, Chiamaka [1,2]
Ezeani, Ignatius [1,2]
Rayson, Paul [1]
El-Haj, Mahmoud [1]
Affiliations
[1] Univ Lancaster, UCREL NLP Grp, Lancaster, England
[2] Nnamdi Azikiwe Univ, Dept Comp Sc, Anambra State, Nigeria
Keywords
Igbo; named entity recognition; BERT models; under-resourced; dataset
DOI
Not available
Chinese Library Classification
TP39 [Computer Applications]
Discipline Codes
081203; 0835
Abstract
This work presents a standard Igbo named entity recognition (IgboNER) dataset as well as the results from training and fine-tuning state-of-the-art transformer IgboNER models. We discuss the process of our dataset creation: data collection, annotation, and quality checking. We also present the experimental processes involved in building an IgboBERT language model from scratch, as well as fine-tuning it, along with other non-Igbo pre-trained models, for the downstream IgboNER task. Our results show that, although the IgboNER task benefited hugely from fine-tuning large pre-trained transformer models, fine-tuning a transformer model built from scratch on comparatively little Igbo text data seems to yield quite decent results for the IgboNER task. This work will contribute immensely to IgboNLP in particular, as well as to wider African and low-resource NLP efforts.
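As an illustration of the downstream fine-tuning step described in the abstract, the following is a minimal sketch using the Hugging Face Transformers and Datasets libraries. It is not the authors' exact pipeline: the IOB2 tag set, the toy Igbo sentence, the multilingual checkpoint, and the hyperparameters are assumptions for demonstration only.

```python
# Minimal sketch: fine-tune a BERT-style checkpoint for Igbo NER as token classification.
# The tag set, example data, checkpoint, and hyperparameters are illustrative assumptions.
from datasets import Dataset
from transformers import (
    AutoTokenizer,
    AutoModelForTokenClassification,
    DataCollatorForTokenClassification,
    TrainingArguments,
    Trainer,
)

# Assumed IOB2 tag inventory; the paper's actual label set may differ.
labels = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "B-DATE", "I-DATE"]
label2id = {l: i for i, l in enumerate(labels)}

# Toy stand-in for the IgboNER corpus: word-level tokens with aligned tags.
raw = Dataset.from_dict({
    "tokens": [["Chinedu", "bi", "na", "Enugu", "."]],
    "ner_tags": [[label2id["B-PER"], label2id["O"], label2id["O"], label2id["B-LOC"], label2id["O"]]],
})

model_name = "bert-base-multilingual-cased"  # any BERT-family checkpoint, including an Igbo one
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(
    model_name,
    num_labels=len(labels),
    id2label={i: l for l, i in label2id.items()},
    label2id=label2id,
)

def tokenize_and_align(batch):
    # Re-tokenise words into wordpieces and copy each word's tag to its first piece;
    # remaining pieces and special tokens get -100 so the loss ignores them.
    enc = tokenizer(batch["tokens"], is_split_into_words=True, truncation=True)
    all_labels = []
    for i, tags in enumerate(batch["ner_tags"]):
        prev, aligned = None, []
        for wid in enc.word_ids(batch_index=i):
            aligned.append(-100 if wid is None or wid == prev else tags[wid])
            prev = wid
        all_labels.append(aligned)
    enc["labels"] = all_labels
    return enc

tokenized = raw.map(tokenize_and_align, batched=True, remove_columns=raw.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="igboner-bert",
        num_train_epochs=3,
        per_device_train_batch_size=16,
        learning_rate=5e-5,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForTokenClassification(tokenizer),
)
trainer.train()
```

The same Trainer API also covers the from-scratch setting reported in the paper: one would first pre-train a BERT configuration with a masked-language-modelling objective on raw Igbo text, then fine-tune the resulting checkpoint exactly as above.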
Pages: 5114-5122
Page count: 9