Injecting structural hints: Using language models to study inductive biases in language learning

Cited: 0
Authors
Papadimitriou, Isabel [1 ]
Jurafsky, Dan [1 ]
Affiliation
[1] Stanford Univ, Comp Sci Dept, Stanford, CA 94305 USA
Keywords
DOI
None available
CLC number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Both humans and large language models are able to learn language without explicit structural supervision. What inductive biases make this learning possible? We address this fundamental cognitive question by leveraging transformer language models: we inject inductive bias into language models by pretraining on formally-structured data, and then evaluate the biased learners' ability to learn typologically-diverse natural languages. Our experimental setup creates a testbed for hypotheses about inductive bias in human language learning. We investigate the effect of injecting models with three types of inductive bias: 1) recursive, hierarchical processing, 2) crossing token-token relationships that can't be modeled by context-free grammars, and 3) a Zipfian power-law vocabulary distribution. We show that non-context-free relationships form the best inductive biases. Our study leverages the capabilities of transformer models to run controlled language learning experiments that are not possible to run on humans, and surfaces hypotheses about the structures that facilitate language learning in both humans and machines.
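The three bias types named in the abstract can be illustrated with toy data generators — a hypothetical sketch only: the function names, vocabulary size, and bracket-pair token encoding are illustrative assumptions, not the paper's actual pretraining corpora. Hierarchical structure corresponds to token pairs closing in last-in-first-out order (nesting), the non-context-free case to first-in-first-out closure (crossing dependencies), and the third to an unstructured stream with Zipfian unigram frequencies.

```python
import random
from collections import deque

def nested_pairs(n_pairs, vocab_size=500):
    """Recursive, hierarchical (context-free) structure: token pairs
    open and close in last-in-first-out order, like balanced brackets."""
    seq, stack, opens_left = [], [], n_pairs
    while opens_left or stack:
        if stack and (opens_left == 0 or random.random() < 0.5):
            seq.append(f"{stack.pop()}>")      # close the innermost open pair
        else:
            t = random.randrange(vocab_size)
            stack.append(t)
            opens_left -= 1
            seq.append(f"<{t}")                # open a new pair

    return seq

def crossing_pairs(n_pairs, vocab_size=500):
    """Non-context-free structure: pairs close in first-in-first-out
    order, so dependencies cross instead of nesting."""
    seq, queue, opens_left = [], deque(), n_pairs
    while opens_left or queue:
        if queue and (opens_left == 0 or random.random() < 0.5):
            seq.append(f"{queue.popleft()}>")  # close the oldest open pair
        else:
            t = random.randrange(vocab_size)
            queue.append(t)
            opens_left -= 1
            seq.append(f"<{t}")
    return seq

def zipfian_tokens(n_tokens, vocab_size=500, alpha=1.0):
    """Unstructured stream whose unigram frequencies follow a Zipfian
    power law: P(rank r) proportional to 1 / r**alpha."""
    weights = [1.0 / (r ** alpha) for r in range(1, vocab_size + 1)]
    return random.choices(range(vocab_size), weights=weights, k=n_tokens)
```

Under this encoding, a crossing sequence such as `<a <b a> b>` cannot be generated by any context-free grammar, whereas every nested sequence can — which is the structural distinction the paper's second bias type isolates.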
Pages: 8402-8413
Page count: 12