Large language models for causal hypothesis generation in science

Cited by: 0
Authors
Cohrs, Kai-Hendrik [1]
Diaz, Emiliano [1]
Sitokonstantinou, Vasileios [1]
Varando, Gherardo [1]
Camps-Valls, Gustau [1]
Affiliations
[1] Univ Valencia, Image Proc Lab IPL, Valencia, Spain
Funding
European Research Council
Keywords
causality; large language models; hypothesis generation; science; causal discovery; learning algorithms; network structures; equivalence; search
DOI
10.1088/2632-2153/ada47f
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Towards the goal of understanding the causal structure underlying complex systems (such as the Earth, the climate, or the brain), integrating large language models (LLMs) with data-driven and domain-expertise-driven approaches has the potential to become a game-changer, especially in data- and expertise-limited scenarios. Debates persist around LLMs' causal reasoning capacities. However, rather than engaging in philosophical debates, we propose integrating LLMs into a scientific framework for causal hypothesis generation alongside expert knowledge and data. Our goals include formalizing LLMs as probabilistic imperfect experts, developing adaptive methods for causal hypothesis generation, and establishing universal benchmarks for comprehensive comparisons. Specifically, we introduce a spectrum of integration methods for experts, LLMs, and data-driven approaches. We review existing approaches for causal hypothesis generation and classify them within this spectrum. As an example, our hybrid (LLM + data) causal discovery algorithm illustrates ways for deeper integration. Characterizing imperfect experts along dimensions such as (1) reliability, (2) consistency, (3) uncertainty, and (4) content vs. reasoning is emphasized for developing adaptable methods. Lastly, we stress the importance of model-agnostic benchmarks.
Pages: 19