Effective Batching for Recurrent Neural Network Grammars

Cited: 0
|
Authors
Noji, Hiroshi [1]
Oseki, Yohei [2]
Affiliations
[1] AIST, Artificial Intelligence Res Ctr, Tokyo, Japan
[2] Univ Tokyo, Grad Sch Arts & Sci, Tokyo, Japan
Keywords
DOI
None available
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As a language model that integrates traditional symbolic operations with flexible neural representations, recurrent neural network grammars (RNNGs) have attracted great attention from both scientific and engineering perspectives. However, RNNGs are known to be hard to scale because of the difficulty of batched training. In this paper, we propose effective batching for RNNGs, in which every operation is computed in parallel with tensors across multiple sentences. Our PyTorch implementation effectively employs a GPU and achieves a 6x speedup over the existing C++ DyNet implementation with model-independent auto-batching. Moreover, our batched RNNG also accelerates inference, achieving a 20-150x speedup for beam search depending on beam size. Finally, we evaluate the syntactic generalization performance of the scaled RNNG against an LSTM baseline, using large training data of 100M tokens from English Wikipedia and the broad-coverage targeted syntactic evaluation benchmark.(1)
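The core batching idea described in the abstract, advancing every sentence's parser state in lockstep and masking out finished sentences, can be sketched minimally as follows. This is an illustrative toy with plain Python lists, not the authors' actual tensorized PyTorch implementation; the names `step_batch`, `SHIFT`, `REDUCE`, and `PAD` are assumptions for the sake of the example.

```python
# Minimal sketch of lockstep batched stack updates, the idea behind
# batching RNNG training: one action per sentence per step, with a
# PAD action masking sentences that have already finished.
SHIFT, REDUCE, PAD = "SHIFT", "REDUCE", "PAD"

def step_batch(stacks, buffers, actions):
    """Apply one action per sentence in lockstep; PAD is a no-op."""
    for i, act in enumerate(actions):
        if act == SHIFT:
            # move the next input word onto this sentence's stack
            stacks[i].append(buffers[i].pop(0))
        elif act == REDUCE:
            # merge the top two stack elements into one constituent
            right = stacks[i].pop()
            left = stacks[i].pop()
            stacks[i].append((left, right))
        # PAD: sentence already finished, leave its state unchanged
    return stacks, buffers

# two sentences of different lengths, padded to a common schedule
stacks = [[], []]
buffers = [["the", "dog", "barks"], ["go"]]
schedule = [
    [SHIFT, SHIFT],
    [SHIFT, PAD],
    [REDUCE, PAD],
    [SHIFT, PAD],
    [REDUCE, PAD],
]
for actions in schedule:
    stacks, buffers = step_batch(stacks, buffers, actions)

print(stacks[0])  # [(('the', 'dog'), 'barks')]
print(stacks[1])  # ['go']
```

In the real implementation, each lockstep update over the batch would be a single tensor operation on the GPU, which is where the reported speedup comes from; the masking pattern shown here is what makes sentences of different lengths and action sequences compatible with that tensorized lockstep.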
Pages: 4340 - 4352
Page count: 13
Related Papers
50 records in total
  • [31] The attractor recurrent neural network based on fuzzy functions: An effective model for the classification of lung abnormalities
    Khodabakhshi, Mohammad Bagher
    Moradi, Mohammad Hassan
    COMPUTERS IN BIOLOGY AND MEDICINE, 2017, 84 : 124 - 136
  • [32] LSTM-based recurrent neural network provides effective short term flu forecasting
    Amendolara, Alfred B.
    Sant, David
    Rotstein, Horacio G.
    Fortune, Eric
    BMC PUBLIC HEALTH, 2023, 23 (01)
  • [34] Repeated sequential learning increases memory capacity via effective decorrelation in a recurrent neural network
    Kurikawa, Tomoki
    Barak, Omri
    Kaneko, Kunihiko
    PHYSICAL REVIEW RESEARCH, 2020, 2 (02):
  • [35] Effective Quantization Approaches for Recurrent Neural Networks
    Alom, Md Zahangir
    Moody, Adam T.
    Maruyama, Naoya
    Van Essen, Brian C.
    Taha, Tarek M.
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018,
  • [36] Stochastic graph recurrent neural network
    Yan, Tijin
    Zhang, Hongwei
    Li, Zirui
    Xia, Yuanqing
    NEUROCOMPUTING, 2022, 500 : 1003 - 1015
  • [37] Kp forecasting with a recurrent neural network
    Sexton, Ernest Scott
    Nykyri, Katariina
    Ma, Xuanye
    JOURNAL OF SPACE WEATHER AND SPACE CLIMATE, 2019, 9
  • [38] MODEL OF A NEURAL NETWORK WITH RECURRENT INHIBITION
    WIGSTROM, H
    KYBERNETIK, 1974, 16 (02): : 103 - 112
  • [39] A recurrent neural network that learns to count
    Rodriguez, P
    Wiles, J
    Elman, JL
    CONNECTION SCIENCE, 1999, 11 (01) : 5 - 40
  • [40] Edge detection with a recurrent neural network
    Vrabel, MJ
    APPLICATIONS AND SCIENCE OF ARTIFICIAL NEURAL NETWORKS II, 1996, 2760 : 365 - 371