Passive learning of active causal strategies in agents and language models

Cited by: 0
Authors
Lampinen, Andrew K. [1 ]
Chan, Stephanie C. Y. [1 ]
Dasgupta, Ishita [1 ]
Nam, Andrew J. [2 ]
Wang, Jane X. [1 ]
Affiliations
[1] Google DeepMind, London, England
[2] Stanford Univ, Stanford, CA USA
Keywords: none listed
DOI: none available
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
What can be learned about causality and experimentation from passive data? This question is salient given recent successes of passively trained language models in interactive domains such as tool use. Passive learning is inherently limited. However, we show that purely passive learning can in fact allow an agent to learn generalizable strategies for determining and using causal structures, as long as the agent can intervene at test time. We formally illustrate that, under certain assumptions, learning a strategy of first experimenting, then seeking goals, can allow generalization from passive learning in principle. We then show empirically that agents trained via imitation on expert data can indeed generalize at test time to infer and use causal links which are never present in the training data; these agents can also generalize experimentation strategies to novel variable sets never observed in training. We then show that strategies for causal intervention and exploitation can be generalized from passive data even in a more complex environment with high-dimensional observations, with the support of natural language explanations. Explanations can even allow passive learners to generalize out-of-distribution from otherwise perfectly confounded training data. Finally, we show that language models, trained only on passive next-word prediction, can generalize causal intervention strategies from a few-shot prompt containing examples of experimentation, together with explanations and reasoning. These results highlight the surprising power of passive learning of active causal strategies, and may help to understand the behaviors and capabilities of language models.
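The experiment-then-exploit strategy described in the abstract can be illustrated with a minimal sketch. This is a hypothetical toy setup, not the paper's actual environment or agent: hidden causal structure is a random DAG over a few variables, the "experimentation" phase intervenes on each variable once and checks whether the goal variable responds, and the "exploitation" phase then intervenes on the discovered causal ancestors to raise the goal. All names (`make_chain_env`, `simulate`, `experiment_then_exploit`) are invented for illustration.

```python
import random

def make_chain_env(n_vars, rng):
    """Hidden causal structure: each non-root variable has one parent drawn
    from earlier-indexed variables, so the graph is guaranteed acyclic."""
    return {i: rng.randrange(i) for i in range(1, n_vars)}

def simulate(parents, interventions, n_vars):
    """Each variable's value = its own intervention (if any) plus its
    parent's value, propagated down the hidden DAG."""
    vals = {}
    def value(i):
        if i not in vals:
            vals[i] = interventions.get(i, 0) + (value(parents[i]) if i in parents else 0)
        return vals[i]
    return [value(i) for i in range(n_vars)]

def experiment_then_exploit(parents, n_vars, goal):
    """Phase 1: intervene on each variable in turn and keep those whose
    intervention moves the goal variable (its causal ancestors).
    Phase 2: exploit the inferred structure by intervening on all ancestors."""
    ancestors = [c for c in range(n_vars)
                 if c != goal and simulate(parents, {c: 1}, n_vars)[goal] > 0]
    final = simulate(parents, {a: 10 for a in ancestors}, n_vars)
    return ancestors, final[goal]

rng = random.Random(0)
parents = make_chain_env(5, rng)
ancestors, goal_value = experiment_then_exploit(parents, 5, goal=4)
```

The point of the sketch is the strategy's shape, not the environment: the same two-phase policy works for any hidden structure drawn from this family, which is the sense in which an experimentation strategy (rather than a fixed causal model) can generalize to structures never seen before.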
Pages: 15