Efficient training of large neural networks for language modeling

Cited by: 0
Author
Schwenk, H. [1]
Affiliation
[1] CNRS, LIMSI, F-91403 Orsay, France
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Recently there has been increasing interest in using neural networks for language modeling. In contrast to the well-known backoff n-gram language models, the neural network approach limits the data-sparseness problem by performing the estimation in a continuous space, thereby allowing smooth interpolation between word histories. However, training such a model and calculating a single n-gram probability are several orders of magnitude more expensive than for backoff models, making the new approach difficult to use in real applications. In this paper, several techniques are presented that allow the use of a neural network language model in a large-vocabulary speech recognition system, in particular very fast lattice rescoring and efficient training of large neural networks on training corpora of over 10 million words. The described approach achieves significant word-error reductions with respect to a carefully tuned 4-gram backoff language model in a state-of-the-art conversational speech recognizer for the DARPA Rich Transcription evaluations.
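As a rough illustration of the continuous-space idea described in the abstract (a minimal sketch in the spirit of Bengio-style feedforward neural language models, not the exact architecture or training setup of this paper), a neural n-gram language model can be written in a few lines of PyTorch. Every layer size, class name, and hyperparameter below is an assumption chosen for readability:

import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralNgramLM(nn.Module):
    # Illustrative feedforward n-gram LM; sizes are arbitrary assumptions.
    def __init__(self, vocab_size, context_size=3, embed_dim=64, hidden_dim=128):
        super().__init__()
        # Each context word is mapped into a continuous space; similar
        # words get nearby embeddings, which smooths probability estimates
        # over n-grams never observed in training (the sparseness problem).
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.hidden = nn.Linear(context_size * embed_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, context):  # context: (batch, context_size) word ids
        e = self.embed(context).flatten(start_dim=1)   # concatenate embeddings
        h = torch.tanh(self.hidden(e))
        return F.log_softmax(self.out(h), dim=-1)      # log P(w | context)

# Toy usage: predict the 4th word of a 4-gram from a 3-word context.
vocab_size = 1000
model = NeuralNgramLM(vocab_size)
context = torch.randint(0, vocab_size, (8, 3))         # batch of 8 contexts
target = torch.randint(0, vocab_size, (8,))
loss = F.nll_loss(model(context), target)              # cross-entropy loss
loss.backward()

The softmax over the full output vocabulary in the last layer dominates the cost of computing even a single n-gram probability, which is why fast lattice rescoring and efficient training techniques of the kind the abstract describes are needed before such models become practical in a recognizer.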
Pages: 3059-3064 (6 pages)
Related Papers (50 in total)
  • [41] ETC: Efficient Training of Temporal Graph Neural Networks over Large-scale Dynamic Graphs
    Gao, Shihong
    Li, Yiming
    Shen, Yanyan
    Shao, Yingxia
    Chen, Lei
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2024, 17 (05): 1060-1072
  • [42] An efficient and accurate solver for large, sparse neural networks
    Stolyarov, Roman M.
    Barreiro, Andrea K.
    Norris, Scott
    BMC Neuroscience, 16 (Suppl 1)
  • [43] Artificial Neural Networks for efficient RF MEMS Modeling
    Vietzorreck, L.
    Milijic, M.
    Marinkovic, Z.
    Kim, T.
    Markovic, V.
    Pronic-Rancic, O.
    2014 XXXITH URSI GENERAL ASSEMBLY AND SCIENTIFIC SYMPOSIUM (URSI GASS), 2014
  • [44] EpsiloNN - A specification language for the efficient parallel simulation of neural networks
    Strey, A
    BIOLOGICAL AND ARTIFICIAL COMPUTATION: FROM NEUROSCIENCE TO TECHNOLOGY, 1997, 1240: 714-722
  • [45] Lightweight and Efficient Neural Natural Language Processing with Quaternion Networks
    Tay, Yi
    Zhang, Aston
    Tuan, Luu Anh
    Rao, Jinfeng
    Zhang, Shuai
    Wang, Shuohang
    Fu, Jie
    Hui, Siu Cheung
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019: 1494-1503
  • [46] ByteGNN: Efficient Graph Neural Network Training at Large Scale
    Zheng, Chenguang
    Chen, Hongzhi
    Cheng, Yuxuan
    Song, Zhezheng
    Wu, Yifan
    Li, Changji
    Cheng, James
    Yang, Hao
    Zhang, Shuai
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2022, 15 (06): 1228-1242
  • [47] Efficient Training of Convolutional Neural Nets on Large Distributed Systems
    Sreedhar, Dheeraj
    Saxena, Vaibhav
    Sabharwal, Yogish
    Verma, Ashish
    Kumar, Sameer
    2018 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER), 2018: 392-401
  • [48] Investigation of Large-Margin Softmax in Neural Language Modeling
    Huo, Jingjing
    Gao, Yingbo
    Wang, Weiyue
    Schlueter, Ralf
    Ney, Hermann
    INTERSPEECH 2020, 2020: 3645-3649
  • [49] Memory-Efficient Training of Binarized Neural Networks on the Edge
    Yayla, Mikail
    Chen, Jian-Jia
    PROCEEDINGS OF THE 59TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC 2022, 2022: 661-666
  • [50] Design and Training of Binarized Neural Networks for Highly Efficient Accelerators
    Li, J.
    Xu, H.
    Wang, Y.
    Xiao, H.
    Wang, Y.
    Han, Y.
    Li, X.
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2023, 35 (06): 961-969