Lessons Learned from GPT-SW3: Building the First Large-Scale Generative Language Model for Swedish

Authors
Ekgren, Ariel [1]
Gyllensten, Amaru Cuba [2]
Gogoulou, Evangelia [2]
Heiman, Alice [1]
Verlinden, Severine [1]
Ohman, Joey [1]
Carlsson, Fredrik [2]
Sahlgren, Magnus [1]
Affiliations
[1] AI Sweden, Lund, Sweden
[2] RISE, Gothenburg, Sweden
Keywords
Language models; Evaluation; Prompting
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
We present GPT-SW3, a 3.5 billion parameter autoregressive language model, trained on a newly created 100 GB Swedish corpus. This paper provides insights into the data collection and training process, and discusses the challenges of proper evaluation. The results of quantitative evaluation using perplexity indicate that GPT-SW3 is a competent model in comparison with existing autoregressive models of similar size. Additionally, we perform an extensive prompting study, which reveals the good text generation capabilities of GPT-SW3.
Pages: 3509-3518
Number of pages: 10