A Generalized Language Model in Tensor Space

Authors
Zhang, Lipeng [1]
Zhang, Peng [1]
Ma, Xindian [1]
Gu, Shuqin [1]
Su, Zhan [1]
Song, Dawei [2]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin, Peoples R China
[2] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In the literature, tensors have been used effectively to capture context information in language models. However, existing methods usually adopt relatively low-order tensors, which have limited expressive power for modeling language. Developing a higher-order tensor representation is challenging, both in deriving an effective solution and in showing its generality. In this paper, we propose a language model named Tensor Space Language Model (TSLM), which utilizes tensor networks and tensor decomposition. In TSLM, we build a high-dimensional semantic space constructed by the tensor product of word vectors. Theoretically, we prove that this tensor representation is a generalization of the n-gram language model. We further show that this high-order tensor representation can be decomposed into a recursive calculation of conditional probability for language modeling. Experimental results on the Penn Tree Bank (PTB) dataset and the WikiText benchmark demonstrate the effectiveness of TSLM.
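The link between a tensor product of word vectors and an n-gram model can be illustrated with a small sketch. This is not the authors' implementation: the toy vocabulary, the randomly generated joint probabilities, the helper name `tensor_product`, and the choice of one-hot word vectors are all assumptions made here for illustration. Under those assumptions, the tensor product of n one-hot vectors is a basis tensor with a single nonzero entry, and contracting it with a tensor of joint probabilities returns exactly the n-gram joint probability of that word sequence.

```python
from itertools import product
from math import isclose, prod
import random

def tensor_product(vectors):
    """Order-n tensor as a dict: index tuple (i1..in) -> v1[i1] * ... * vn[in]."""
    ranges = [range(len(v)) for v in vectors]
    return {idx: prod(v[i] for v, i in zip(vectors, idx))
            for idx in product(*ranges)}

def one_hot(index, size):
    """One-hot word vector (an illustrative assumption, not taken from the abstract)."""
    return [1.0 if i == index else 0.0 for i in range(size)]

# Hypothetical toy vocabulary and a made-up trigram joint distribution.
vocab = ["the", "cat", "sat"]
V = len(vocab)
rng = random.Random(0)
weights = {idx: rng.random() for idx in product(range(V), repeat=3)}
total = sum(weights.values())
joint = {idx: w / total for idx, w in weights.items()}  # entries sum to 1

# Tensor product of the one-hot vectors for "the cat sat".
word_ids = (0, 1, 2)
basis = tensor_product([one_hot(i, V) for i in word_ids])

# Full contraction (inner product) of the joint tensor with the basis tensor.
p = sum(joint[idx] * basis[idx] for idx in joint)

# With one-hot vectors, the contraction picks out one trigram probability.
print(isclose(p, joint[word_ids]))
```

Replacing the one-hot vectors with dense vectors makes the same contraction a weighted combination of many entries of the tensor, which is one informal way to read the claim that the tensor representation generalizes the n-gram model.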
Pages: 7450-7458
Page count: 9