Quantity doesn't buy quality syntax with neural language models

Cited: 0
Authors:
van Schijndel, Marten [1]
Mueller, Aaron [2]
Linzen, Tal [2]
Affiliations:
[1] Cornell Univ, Ithaca, NY 14853 USA
[2] Johns Hopkins Univ, Baltimore, MD 21218 USA
Keywords: none
DOI: not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Recurrent neural networks can learn to predict upcoming words remarkably well on average; in syntactically complex contexts, however, they often assign unexpectedly high probabilities to ungrammatical words. We investigate to what extent these shortcomings can be mitigated by increasing the size of the network and the corpus on which it is trained. We find that gains from increasing network size are minimal beyond a certain point. Likewise, expanding the training corpus yields diminishing returns; we estimate that the training corpus would need to be unrealistically large for the models to match human performance. A comparison to GPT and BERT, Transformer-based models trained on billions of words, reveals that these models perform even more poorly than our LSTMs in some constructions. Our results make the case for more data-efficient architectures.
Pages: 5831-5837
Page count: 7