A qualitative assessment of using ChatGPT as large language model for scientific workflow development

Cited by: 1
Authors
Saenger, Mario [1]
De Mecquenem, Ninon [1]
Lewinska, Katarzyna Ewa [2, 3]
Bountris, Vasilis [1]
Lehmann, Fabian [1]
Leser, Ulf [1]
Kosch, Thomas [1]
Affiliations
[1] Humboldt Univ, Dept Comp Sci, D-10099 Berlin, Germany
[2] Humboldt Univ, Dept Geog, D-10099 Berlin, Germany
[3] Univ Wisconsin Madison, Dept Forest & Wildlife Ecol, Madison, WI 53706 USA
Source
GIGASCIENCE | 2024 / Volume 13
Keywords
large language models; scientific workflows; user support; ChatGPT; END-USER DEVELOPMENT; GENERATION; ALIGNMENT; FUTURE
DOI
10.1093/gigascience/giae030
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Background: Scientific workflow systems are increasingly popular for expressing and executing complex data analysis pipelines over large datasets, as they offer reproducibility, dependability, and scalability of analyses by automatic parallelization on large compute clusters. However, implementing workflows is difficult due to the involvement of many black-box tools and the deep infrastructure stack necessary for their execution. Simultaneously, user-supporting tools are rare, and the number of available examples is much lower than in classical programming languages.

Results: To address these challenges, we investigate the efficiency of large language models (LLMs), specifically ChatGPT, to support users when dealing with scientific workflows. We performed 3 user studies in 2 scientific domains to evaluate ChatGPT for comprehending, adapting, and extending workflows. Our results indicate that LLMs efficiently interpret workflows but achieve lower performance for exchanging components or purposeful workflow extensions. We characterize their limitations in these challenging scenarios and suggest future research directions.

Conclusions: Our results show a high accuracy for comprehending and explaining scientific workflows while achieving a reduced performance for modifying and extending workflow descriptions. These findings clearly illustrate the need for further research in this area.
Pages: 19