OceanGPT: A Large Language Model for Ocean Science Tasks

Cited by: 0
Authors
Bi, Zhen [1, 2, 5, 6]
Zhang, Ningyu [1, 2, 5]
Xue, Yida [1]
Ou, Yixin [1]
Ji, Daxiong [2, 3]
Zheng, Guozhou [2, 4]
Chen, Huajun [1, 2]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Peoples R China
[2] Zhejiang Univ, Donghai Lab, Hangzhou, Peoples R China
[3] Zhejiang Univ, Ocean Coll, Hangzhou, Peoples R China
[4] Zhoushan Zhejiang Univ, Ocean Res Ctr, Hangzhou, Peoples R China
[5] Zhejiang Univ, Sch Software Technol, Hangzhou, Peoples R China
[6] Huzhou Univ, Huzhou, Peoples R China
Funding
National Natural Science Foundation of China
DOI
Not available
Abstract
Ocean science, which delves into the oceans that are reservoirs of life and biodiversity, is of great significance given that oceans cover over 70% of our planet's surface. Recently, advances in Large Language Models (LLMs) have transformed the paradigm in science. Despite their success in other domains, current LLMs often fall short in catering to the needs of domain experts such as oceanographers, and the potential of LLMs for ocean science is under-explored. The intrinsic reasons are the immense and intricate nature of ocean data as well as the necessity for higher granularity and richness in knowledge. To alleviate these issues, we introduce OCEANGPT, the first-ever large language model in the ocean domain, which is expert in various ocean science tasks. We also propose DOINSTRUCT, a novel framework to automatically obtain a large volume of ocean domain instruction data, which generates instructions based on multi-agent collaboration. Additionally, we construct the first oceanography benchmark, OCEANBENCH, to evaluate the capabilities of LLMs in the ocean domain. Through comprehensive experiments, OCEANGPT not only shows a higher level of knowledge expertise for ocean science tasks but also gains preliminary embodied intelligence capabilities in ocean technology.
Pages: 3357-3372
Number of pages: 16