Injecting structural hints: Using language models to study inductive biases in language learning

Cited by: 0
Authors:
Papadimitriou, Isabel [1]
Jurafsky, Dan [1]
Affiliations:
[1] Stanford Univ, Comp Sci Dept, Stanford, CA 94305 USA
Keywords: (none listed)
DOI: Not available
CLC number: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract:
Both humans and large language models are able to learn language without explicit structural supervision. What inductive biases make this learning possible? We address this fundamental cognitive question by leveraging transformer language models: we inject inductive bias into language models by pretraining on formally structured data, and then evaluate the biased learners' ability to learn typologically diverse natural languages. Our experimental setup creates a testbed for hypotheses about inductive bias in human language learning. We investigate the effect of injecting models with three types of inductive bias: 1) recursive, hierarchical processing, 2) crossing token-token relationships that can't be modeled by context-free grammars, and 3) a Zipfian power-law vocabulary distribution. We show that non-context-free relationships form the best inductive biases. Our study leverages the capabilities of transformer models to run controlled language learning experiments that are not possible to run on humans, and surfaces hypotheses about the structures that facilitate language learning in both humans and machines.
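The abstract outlines the experimental recipe: pretrain a transformer on synthetic corpora that each embody a single formal structure, then measure how well the biased model goes on to learn natural languages. As a rough illustration of what such synthetic corpora could look like (a minimal sketch, not the paper's released code; the function names, vocabulary size, and bracket notation are assumptions for exposition), the Python below generates the three kinds of structure the abstract names: hierarchically nested pairs, crossing non-context-free pairs, and a Zipfian token distribution.

```python
import random

# Illustrative assumptions, not the paper's exact configuration.
VOCAB_SIZE = 500  # hypothetical vocabulary size


def zipf_token(vocab_size: int) -> int:
    """Sample a token id from a Zipfian (power-law) unigram
    distribution: P(rank r) is proportional to 1 / r."""
    weights = [1.0 / rank for rank in range(1, vocab_size + 1)]
    return random.choices(range(vocab_size), weights=weights, k=1)[0]


def nested_sequence(n_pairs: int) -> list[str]:
    """Hierarchically nested open/close pairs (context-free, Dyck-like):
    the most recently opened pair always closes first, e.g. <3 <7 7> 3>."""
    seq, stack, opened = [], [], 0
    while opened < n_pairs or stack:
        if stack and (opened == n_pairs or random.random() < 0.5):
            seq.append(f"{stack.pop()}>")  # close the innermost open pair
        else:
            tok = zipf_token(VOCAB_SIZE)
            stack.append(tok)
            seq.append(f"<{tok}")
            opened += 1
    return seq


def crossing_sequence(n_pairs: int) -> list[str]:
    """Crossing dependencies (beyond context-free power): pairs close in
    the order they were opened, e.g. <3 <7 3> 7>."""
    seq, queue, opened = [], [], 0
    while opened < n_pairs or queue:
        if queue and (opened == n_pairs or random.random() < 0.5):
            seq.append(f"{queue.pop(0)}>")  # first-opened closes first
        else:
            tok = zipf_token(VOCAB_SIZE)
            queue.append(tok)
            seq.append(f"<{tok}")
            opened += 1
    return seq


if __name__ == "__main__":
    print(" ".join(nested_sequence(6)))
    print(" ".join(crossing_sequence(6)))
```

The stack/queue contrast is the whole difference between the first two structures: LIFO closing yields context-free nesting, while FIFO closing yields the crossing dependencies that context-free grammars cannot model, the bias the abstract reports as most helpful.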
Pages: 8402-8413
Page count: 12
Related papers (50 in total):
  • [21] (Ir)rationality and cognitive biases in large language models
    Macmillan-Scott, Olivia
    Musolesi, Mirco
    ROYAL SOCIETY OPEN SCIENCE, 2024, 11 (06):
  • [22] Evaluation and mitigation of cognitive biases in medical language models
    Schmidgall, Samuel
    Harris, Carl
    Essien, Ime
    Olshvang, Daniel
    Rahman, Tawsifur
    Kim, Ji Woong
    Ziaei, Rojin
    Eshraghian, Jason
    Abadir, Peter
    Chellappa, Rama
    NPJ DIGITAL MEDICINE, 2024, 7 (01):
  • [23] Counterexample Guided Inductive Synthesis Using Large Language Models and Satisfiability Solving
    Jha, Sumit Kumar
    Jha, Susmit
    Lincoln, Patrick
    Bastian, Nathaniel D.
    Velasquez, Alvaro
    Ewetz, Rickard
    Neema, Sandeep
    MILCOM 2023 - 2023 IEEE MILITARY COMMUNICATIONS CONFERENCE, 2023,
  • [24] Structural Language Models of Code
    Alon, Uri
    Sadaka, Roy
    Levy, Omer
    Yahav, Eran
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [25] Models of language acquisition: Inductive and deductive approaches.
    Ishikawa, M
    WORD-JOURNAL OF THE INTERNATIONAL LINGUISTIC ASSOCIATION, 2003, 54 (02): 270 - 274
  • [26] Models of language acquisition: Inductive and deductive approaches.
    Walicek, DE
    Broeder, P
    Murre, J
    LANGUAGE, 2005, 81 (02) : 513 - 514
  • [27] Models of language acquisition: Inductive and deductive approaches.
    Myers, SA
    LANGUAGE, 2003, 79 (02) : 430 - 430
  • [28] Models of language acquisition: Inductive and deductive approaches.
    Fletcher, P
    JOURNAL OF LINGUISTICS, 2004, 40 (02) : 400 - 405
  • [29] Backdoor Learning of Language Models in Natural Language Processing
    University of Michigan
  • [30] Language quotient (LQ): new models of language learning
    Ilyas, Mohammed
    INTERNATIONAL JOURNAL OF ADVANCED AND APPLIED SCIENCES, 2016, 3 (09): 44 - 50