OceanGPT: A Large Language Model for Ocean Science Tasks

Cited by: 0

Authors:

Bi, Zhen [1,2,5,6]

Zhang, Ningyu [1,2,5]

Xue, Yida [1]

Ou, Yixin [1]

Ji, Daxiong [2,3]

Zheng, Guozhou [2,4]

Chen, Huajun [1,2]

Affiliations:

[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Peoples R China

[2] Zhejiang Univ, Donghai Lab, Hangzhou, Peoples R China

[3] Zhejiang Univ, Ocean Coll, Hangzhou, Peoples R China

[4] Zhoushan Zhejiang Univ, Ocean Res Ctr, Hangzhou, Peoples R China

[5] Zhejiang Univ, Sch Software Technol, Hangzhou, Peoples R China

[6] Huzhou Univ, Huzhou, Peoples R China

Source:

PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS | 2024

Funding:

National Natural Science Foundation of China

DOI: Not available

Abstract:

Ocean science, which delves into the oceans that are reservoirs of life and biodiversity, is of great significance given that oceans cover over 70% of our planet's surface. Recently, advances in Large Language Models (LLMs) have transformed the paradigm in science. Despite their success in other domains, current LLMs often fall short of meeting the needs of domain experts such as oceanographers, and the potential of LLMs for ocean science is under-explored. The intrinsic reasons are the immense and intricate nature of ocean data and the need for higher granularity and richness in knowledge. To alleviate these issues, we introduce OCEANGPT, the first-ever large language model in the ocean domain, which is expert in various ocean science tasks. We also propose DOINSTRUCT, a novel framework that automatically obtains a large volume of ocean domain instruction data by generating instructions through multi-agent collaboration. Additionally, we construct the first oceanography benchmark, OCEANBENCH, to evaluate the capabilities of LLMs in the ocean domain. Through comprehensive experiments, OCEANGPT not only shows a higher level of knowledge expertise for ocean science tasks but also gains preliminary embodied intelligence capabilities in ocean technology.

Pages: 3357-3372

Page count: 16
