ViDeBERTa: A powerful pre-trained language model for Vietnamese

Authors
Tran, Cong Dao [1]
Pham, Nhut Huy [1]
Nguyen, Anh [2]
Hy, Truong Son [3]
Vu, Tu [4]
Affiliations
[1] FPT Software AI Ctr, Hanoi, Vietnam
[2] Microsoft, Washington, DC USA
[3] Univ Calif San Diego, San Diego, CA 92103 USA
[4] Univ Massachusetts Amherst, Amherst, MA USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
This paper presents ViDeBERTa, a new pre-trained monolingual language model for Vietnamese, with three versions, ViDeBERTa(xsmall), ViDeBERTa(base), and ViDeBERTa(large), which are pre-trained on a large-scale corpus of high-quality and diverse Vietnamese texts using the DeBERTa architecture. Although many successful Transformer-based pre-trained language models have been proposed for English, there are still few pre-trained models for Vietnamese, a low-resource language, that achieve good results on downstream tasks, especially question answering. We fine-tune and evaluate our models on three important natural language downstream tasks: part-of-speech tagging, named-entity recognition, and question answering. The empirical results demonstrate that ViDeBERTa, with far fewer parameters, surpasses the previous state-of-the-art models on multiple Vietnamese-specific natural language understanding tasks. Notably, ViDeBERTa(base) with 86M parameters, only about 23% of the 370M parameters of PhoBERT(large), still performs as well as or better than the previous state-of-the-art model. Our ViDeBERTa models are available at: https://github.com/HySonLab/ViDeBERTa.
Pages: 1071-1078
Page count: 8
Related papers
50 in total
  • [1] PhoBERT: Pre-trained language models for Vietnamese
    Dat Quoc Nguyen
    Anh Tuan Nguyen
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020: 1037 - 1042
  • [2] A Statistical Language Model for Pre-Trained Sequence Labeling: A Case Study on Vietnamese
    Liao, Xianwen
    Huang, Yongzhong
    Yang, Peng
    Chen, Lei
    [J]. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2022, 21 (03)
  • [3] Hyperbolic Pre-Trained Language Model
    Chen, Weize
    Han, Xu
    Lin, Yankai
    He, Kaichen
    Xie, Ruobing
    Zhou, Jie
    Liu, Zhiyuan
    Sun, Maosong
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 3101 - 3112
  • [4] Pre-trained Language Model Representations for Language Generation
    Edunov, Sergey
    Baevski, Alexei
    Auli, Michael
    [J]. 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019: 4052 - 4059
  • [5] ViHealthBERT: Pre-trained Language Models for Vietnamese in Health Text Mining
    Minh Phuc Nguyen
    Vu Hoang Tran
    Vu Hoang
    Ta Duc Huy
    Bui, Trung H.
    Truong, Steven Q. H.
    [J]. LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022: 328 - 337
  • [6] A Study of Vietnamese Sentiment Classification with Ensemble Pre-trained Language Models
    Thin, Dang Van
    Hao, Duong Ngoc
    Nguyen, Ngan Luu-Thuy
    [J]. VIETNAM JOURNAL OF COMPUTER SCIENCE, 2024, 11 (01) : 137 - 165
  • [7] Adder Encoder for Pre-trained Language Model
    Ding, Jianbang
    Zhang, Suiyun
    Li, Linlin
    [J]. CHINESE COMPUTATIONAL LINGUISTICS, CCL 2023, 2023, 14232 : 339 - 347
  • [8] Pre-Trained Language Model-Based Deep Learning for Sentiment Classification of Vietnamese Feedback
    Loc, Cu Vinh
    Viet, Truong Xuan
    Viet, Tran Hoang
    Thao, Le Hoang
    Viet, Nguyen Hoang
    [J]. INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS, 2023, 22 (03)
  • [9] SurgicBERTa: a pre-trained language model for procedural surgical language
    Bombieri, Marco
    Rospocher, Marco
    Ponzetto, Simone Paolo
    Fiorini, Paolo
    [J]. INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, 2024, 18 (01) : 69 - 81
  • [10] Error Investigation of Pre-trained BERTology Models on Vietnamese Natural Language Inference
    Tin Van Huynh
    Huy Quoc To
    Kiet Van Nguyen
    Ngan Luu-Thuy Nguyen
    [J]. RECENT CHALLENGES IN INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2022, 2022, 1716 : 176 - 188