Evaluation of Open-Source Large Language Models for Metal-Organic Frameworks Research

Cited: 5
Authors
Bai, Xuefeng [1 ,2 ]
Xie, Yabo [1 ,2 ]
Zhang, Xin [1 ,2 ]
Han, Honggui [3 ,4 ]
Li, Jian-Rong [1 ,2 ]
Affiliations
[1] Beijing Univ Technol, Coll Mat Sci & Engn, Beijing Key Lab Green Catalysis & Separat, Beijing 100124, Peoples R China
[2] Beijing Univ Technol, Coll Mat Sci & Engn, Dept Chem Engn, Beijing 100124, Peoples R China
[3] Beijing Univ Technol, Fac Informat Technol, Engn Res Ctr Digital Community, Beijing Lab Urban Mass Transit, Minist Educ, Beijing 100124, Peoples R China
[4] Beijing Univ Technol, Fac Informat Technol, Beijing Key Lab Computat Intelligence & Intelligen, Beijing 100124, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation;
Keywords
Compendex;
DOI
10.1021/acs.jcim.4c00065
Chinese Library Classification (CLC)
R914 [Medicinal Chemistry];
Discipline Code
100701 ;
Abstract
With the development of machine learning, deep learning, and large language models (LLMs) such as GPT-4 (GPT: Generative Pre-Trained Transformer), artificial intelligence (AI) tools have been playing an increasingly important role in chemical and materials research, facilitating materials screening and design. Despite the exciting progress of GPT-4-based AI research assistants, open-source LLMs have not gained much attention from the scientific community. This work focused on metal-organic frameworks (MOFs) as a subdomain of chemistry and evaluated six top-rated open-source LLMs on a comprehensive set of tasks covering the basic units of MOF research: MOF knowledge, basic chemistry knowledge, in-depth chemistry knowledge, knowledge extraction, database reading, material property prediction, experiment design, computational script generation, experiment guidance, data analysis, and paper polishing. In general, these LLMs were capable of performing most of the tasks. In particular, Llama2-7B and ChatGLM2-6B performed well with moderate computational resources. Additionally, comparing versions of the same model with different parameter counts revealed the superior performance of the higher-parameter versions.
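The abstract describes a task-based evaluation protocol: each open-source LLM answers prompts drawn from task categories spanning MOF research. A minimal sketch of such a harness is shown below; it is not from the paper. `ask_model` is a hypothetical placeholder for a real LLM call (e.g., a locally hosted Llama2-7B or ChatGLM2-6B), and the prompt text and category names are illustrative only.

```python
# Hypothetical sketch of a task-category evaluation harness, assuming a
# model callable that maps a prompt string to an answer string. Replace
# ask_model with a call to a locally hosted open-source LLM to use it.

TASK_CATEGORIES = [
    "MOF knowledge",
    "basic chemistry knowledge",
    "knowledge extraction",
    "material property prediction",
    "computational script generation",
]

def ask_model(prompt: str) -> str:
    """Placeholder model call; swap in a real LLM inference function."""
    return f"[model answer to: {prompt}]"

def evaluate(model_fn, categories):
    """Collect one response per task category for later manual grading."""
    results = {}
    for category in categories:
        # Illustrative prompt; the paper's actual prompts are not reproduced here.
        prompt = f"({category}) Describe a representative MOF and its key property."
        results[category] = model_fn(prompt)
    return results

responses = evaluate(ask_model, TASK_CATEGORIES)
print(len(responses))  # one collected answer per task category
```

Collected answers would then be scored by human experts per category, which is how per-task comparisons between models (and between parameter-count versions of the same model) can be tabulated.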
Pages: 4958 - 4965
Page count: 8
Related Papers
50 items
  • [1] Servicing open-source large language models for oncology
    Ray, Partha Pratim
    ONCOLOGIST, 2024,
  • [2] A tutorial on open-source large language models for behavioral science
    Hussain, Zak
    Binz, Marcel
    Mata, Rui
    Wulff, Dirk U.
    BEHAVIOR RESEARCH METHODS, 2024: 8214 - 8237
  • [3] Metal-organic open frameworks (MOFs)
    Kaskel, S
    Schüth, F
    Stöcker, M
    MICROPOROUS AND MESOPOROUS MATERIALS, 2004, 73 (1-2) : 1 - 1
  • [4] Preliminary Systematic Review of Open-Source Large Language Models in Education
    Lin, Michael Pin-Chuan
    Chang, Daniel
    Hall, Sarah
    Jhajj, Gaganpreet
    GENERATIVE INTELLIGENCE AND INTELLIGENT TUTORING SYSTEMS, PT I, ITS 2024, 2024, 14798 : 68 - 77
  • [5] ChatMOF: an artificial intelligence system for predicting and generating metal-organic frameworks using large language models
    Kang, Yeonghun
    Kim, Jihan
    NATURE COMMUNICATIONS, 2024, 15 (01)
  • [6] PharmaLLM: A Medicine Prescriber Chatbot Exploiting Open-Source Large Language Models
    Azam, Ayesha
    Naz, Zubaira
    Khan, Muhammad Usman Ghani
    HUMAN-CENTRIC INTELLIGENT SYSTEMS, 2024, 4 (4): 527 - 544
  • [7] Automated Essay Scoring and Revising Based on Open-Source Large Language Models
    Song, Yishen
    Zhu, Qianta
    Wang, Huaibo
    Zheng, Qinhua
    IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES, 2024, 17 : 1920 - 1930
  • [8] Open-source large language models in action: A bioinformatics chatbot for PRIDE database
    Bai, Jingwen
    Kamatchinathan, Selvakumar
    Kundu, Deepti J.
    Bandla, Chakradhar
    Vizcaino, Juan Antonio
    Perez-Riverol, Yasset
    PROTEOMICS, 2024,
  • [9] Open-source large language models in medical education: Balancing promise and challenges
    Ray, Partha Pratim
    ANATOMICAL SCIENCES EDUCATION, 2024, 17 (06) : 1361 - 1362
  • [10] Accessible Russian Large Language Models: Open-Source Models and Instructive Datasets for Commercial Applications
    Kosenko, D. P.
    Kuratov, Yu. M.
    Zharikova, D. R.
    DOKLADY MATHEMATICS, 2023, 108 (SUPPL 2) : S393 - S398