A Generalized Language Model in Tensor Space

Cited by: 0
Authors
Zhang, Lipeng [1 ]
Zhang, Peng [1 ]
Ma, Xindian [1 ]
Gu, Shuqin [1 ]
Su, Zhan [1 ]
Song, Dawei [2 ]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin, Peoples R China
[2] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing, Peoples R China
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In the literature, tensors have been effectively used for capturing context information in language models. However, existing methods usually adopt relatively low-order tensors, which have limited expressive power for modeling language. Developing a higher-order tensor representation is challenging, both in deriving an effective solution and in showing its generality. In this paper, we propose a language model named the Tensor Space Language Model (TSLM), which utilizes tensor networks and tensor decomposition. In TSLM, we construct a high-dimensional semantic space via the tensor product of word vectors. Theoretically, we prove that this tensor representation is a generalization of the n-gram language model. We further show that this high-order tensor representation can be decomposed into a recursive calculation of conditional probabilities for language modeling. Experimental results on the Penn Treebank (PTB) dataset and the WikiText benchmark demonstrate the effectiveness of TSLM.
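The sense in which a tensor representation generalizes an n-gram model can be sketched concretely: the joint probabilities of all length-n word sequences form an order-n tensor over the vocabulary, and contracting that tensor with the tensor product of one-hot word vectors recovers an n-gram probability. The following is a minimal illustration, not the authors' implementation; the toy vocabulary, toy corpus, and bigram order are assumptions made for readability.

```python
import numpy as np

# Toy setup (illustrative assumption, not from the paper).
vocab = ["<s>", "the", "cat", "sat"]
V, n = len(vocab), 2  # bigram case for readability

# Order-n tensor of joint n-gram probabilities, P[i, j] = p(w_i, w_j),
# estimated from a toy corpus by counting consecutive word pairs.
corpus = ["<s>", "the", "cat", "sat", "<s>", "the", "cat"]
P = np.zeros((V,) * n)
for a, b in zip(corpus, corpus[1:]):
    P[vocab.index(a), vocab.index(b)] += 1
P /= P.sum()

def one_hot(word):
    e = np.zeros(V)
    e[vocab.index(word)] = 1.0
    return e

# A word sequence is represented by the tensor product of its word
# vectors; contracting it with P recovers the n-gram probability.
sent = np.einsum("i,j->ij", one_hot("the"), one_hot("cat"))
print("p(the, cat) =", np.sum(P * sent))  # ~0.333 on this toy corpus
```

The explicit order-n tensor grows exponentially with n, which is why TSLM instead works with trained word vectors and a tensor decomposition that reduces the representation to a recursive calculation of conditional probabilities.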
Pages: 7450-7458
Number of pages: 9