SeaLLMs - Large Language Models for Southeast Asia

Authors
Nguyen, Xuan-Phi [1]
Zhang, Wenxuan [1]
Li, Xin [1]
Aljunied, Mahani [1]
Hu, Zhiqiang [1]
Shen, Chenhui [1]
Chia, Yew Ken [1]
Li, Xingxuan [1]
Wang, Jianyu [1]
Tan, Qingyu [1]
Cheng, Liying [1]
Chen, Guanzheng [1]
Deng, Yue [1]
Yang, Sen [1]
Liu, Chaoqun [1]
Zhang, Hang [1]
Bing, Lidong [1]
Affiliations
[1] Alibaba Grp, DAMO Acad, Hangzhou, Zhejiang, Peoples R China
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Despite the remarkable achievements of large language models (LLMs) in various tasks, there remains a linguistic bias that favors high-resource languages, such as English, often at the expense of low-resource and regional languages. To address this imbalance, we introduce SeaLLMs, an innovative series of language models that specifically focuses on Southeast Asian (SEA) languages. SeaLLMs are built upon popular English-centric models through continued pre-training with an extended vocabulary, specialized instruction and alignment tuning to better capture the intricacies of regional languages. This allows them to respect and reflect local cultural norms, customs, stylistic preferences, and legal considerations. Our comprehensive evaluation demonstrates that SeaLLM models exhibit superior performance across a wide spectrum of linguistic tasks and assistant-style instruction-following capabilities relative to comparable open-source models. Moreover, they outperform ChatGPT-3.5 in non-Latin languages, such as Thai, Khmer, Lao, and Burmese, by large margins while remaining lightweight and cost-effective to operate.
Pages: 294 - 304
Number of pages: 11