Injecting structural hints: Using language models to study inductive biases in language learning

Cited by: 0
Authors:
Papadimitriou, Isabel [1]
Jurafsky, Dan [1]
Affiliations:
[1] Stanford Univ, Comp Sci Dept, Stanford, CA 94305 USA
DOI: not available
CLC classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Both humans and large language models are able to learn language without explicit structural supervision. What inductive biases make this learning possible? We address this fundamental cognitive question by leveraging transformer language models: we inject inductive bias into language models by pretraining on formally-structured data, and then evaluate the biased learners' ability to learn typologically diverse natural languages. Our experimental setup creates a testbed for hypotheses about inductive bias in human language learning. We investigate the effect of injecting models with three types of inductive bias: 1) recursive, hierarchical processing, 2) crossing token-token relationships that can't be modeled by context-free grammars, and 3) a Zipfian power-law vocabulary distribution. We show that non-context-free relationships form the best inductive biases. Our study leverages the capabilities of transformer models to run controlled language learning experiments that are not possible to run on humans, and surfaces hypotheses about the structures that facilitate language learning in both humans and machines.
Pages: 8402-8413 (12 pages)
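The three kinds of formally-structured pretraining data described in the abstract can be illustrated with small synthetic generators. This is a minimal sketch, not the authors' actual code: the function names, the integer encoding (token t opens a dependency, -t closes it), and the sampling details are assumptions for illustration only.

```python
import random

def zipf_token(vocab_size, alpha=1.0):
    """Draw a token id in 1..vocab_size with probability proportional to 1/rank^alpha,
    giving the vocabulary a Zipfian power-law distribution (bias type 3)."""
    weights = [1.0 / (rank ** alpha) for rank in range(1, vocab_size + 1)]
    return random.choices(range(1, vocab_size + 1), weights=weights, k=1)[0]

def nested_sequence(vocab_size, n_pairs):
    """Hierarchical, context-free data (bias type 1): each token t is later
    closed by -t, with matched pairs fully nested, as in a Dyck language."""
    seq, stack, opens_left = [], [], n_pairs
    while opens_left or stack:
        if opens_left and (not stack or random.random() < 0.5):
            tok = zipf_token(vocab_size)
            stack.append(tok)
            seq.append(tok)
            opens_left -= 1
        else:
            seq.append(-stack.pop())  # close the most recently opened token
    return seq

def crossing_sequence(vocab_size, n_pairs):
    """Non-context-free data (bias type 2): matched pairs are placed at random
    positions, so dependencies may cross -- relationships a CFG cannot model."""
    seq = [0] * (2 * n_pairs)
    slots = list(range(2 * n_pairs))
    random.shuffle(slots)
    for _ in range(n_pairs):
        tok = zipf_token(vocab_size)
        a, b = slots.pop(), slots.pop()
        seq[min(a, b)], seq[max(a, b)] = tok, -tok
    return seq

if __name__ == "__main__":
    random.seed(0)
    print(nested_sequence(vocab_size=50, n_pairs=5))
    print(crossing_sequence(vocab_size=50, n_pairs=5))
```

Corpora of such sequences would serve as the structured pretraining data; the biased model is then evaluated on natural languages. Here the nested generator guarantees well-formed bracketing, while the crossing generator only guarantees that every opener has a matching closer, so crossings arise with high probability.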