Topic Structure-Aware Neural Language Model: Unified language model that maintains word and topic ordering by their embedded representations

Cited by: 4
|
Authors
Kawamae, Noriaki [1]
Affiliation
[1] NTT COMWARE, Tokyo, Japan
Keywords
Word embeddings; Topic models; Recurrent neural network; Neural language model
DOI
10.1145/3308558.3313757
CLC number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
Our goal is a unified language model that precisely explains the generative process of documents in terms of their semantic and topic structures. Because existing methods model documents in disparate ways, we expect that coordinating them will achieve this goal more effectively than using any of them in isolation; we therefore combine topic models, embedding models, and neural language models. Observing that topic models can be shared among, and indeed complement, embedding models and neural language models, we propose Word and topic 2 vec (Wat2vec) and the Topic Structure-Aware Neural Language Model (TSANL). Wat2vec treats topics as global semantic information and words as local semantic information, and embeds both words and topics in the same vector space. TSANL uses recurrent neural networks to capture long-range dependencies over topics and words. Whereas existing topic models require time-consuming learning and scale poorly because they discard document structure such as the order of words and topics, TSANL maintains the order of words and topics as phrases and segments, respectively. TSANL reduces computation cost and memory requirements by feeding its topic recurrent neural networks and topic-specific word networks with these embedding representations. Experiments show that TSANL maintains both segments and topical phrases, and thereby improves on previous models.
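The architecture described in the abstract (a shared word–topic embedding space, a topic-level RNN over segment ordering, and a topic-conditioned word RNN over phrase ordering) can be sketched roughly as below. This is a minimal illustrative sketch, not the paper's implementation: all dimensions, weight layouts, the vanilla-RNN cell, and the convention of storing topic embeddings as extra rows of the word embedding matrix are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: vocab, number of topics, embedding dim, hidden dim.
V, K, D, H = 10, 3, 8, 16

# Words and topics embedded in the same space (Wat2vec-style):
# rows 0..V-1 hold word vectors, rows V..V+K-1 hold topic vectors
# (the row layout is an assumption for this sketch).
E = rng.normal(scale=0.1, size=(V + K, D))

# Topic-level RNN: captures long-range dependencies over the topic sequence.
Wx_t, Wh_t = rng.normal(size=(H, D)), rng.normal(size=(H, H))
# Topic-specific word RNN: its state is conditioned on the topic state.
Wx_w, Wh_w, Wc = rng.normal(size=(H, D)), rng.normal(size=(H, H)), rng.normal(size=(H, H))
W_out = rng.normal(size=(V, H))  # projects hidden state to vocab logits

def next_word_dist(topic_ids, word_ids):
    """Probability distribution over the next word given topic and word history."""
    h_t = np.zeros(H)
    for t in topic_ids:                  # segment-level topic ordering
        h_t = np.tanh(Wx_t @ E[V + t] + Wh_t @ h_t)
    h_w = np.zeros(H)
    for w in word_ids:                   # phrase-level word ordering
        h_w = np.tanh(Wx_w @ E[w] + Wh_w @ h_w + Wc @ h_t)
    logits = W_out @ h_w
    p = np.exp(logits - logits.max())    # numerically stable softmax
    return p / p.sum()

p = next_word_dist(topic_ids=[0, 2], word_ids=[1, 4, 7])
assert p.shape == (V,) and abs(p.sum() - 1.0) < 1e-9
```

Because both RNNs consume the same embedding table, word and topic representations interact in one space, which is the mechanism the abstract credits for reduced calculation cost and memory.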
Pages: 2900-2906
Page count: 7
Related papers
50 total
  • [41] TopicBERT: A Topic-Enhanced Neural Language Model Fine-Tuned for Sentiment Classification
    Zhou, Yuxiang
    Liao, Lejian
    Gao, Yang
    Wang, Rui
    Huang, Heyan
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (01) : 380 - 393
  • [42] A neural topic model with word vectors and entity vectors for short texts
    Zhao, Xiaowei
    Wang, Deqing
    Zhao, Zhengyang
    Liu, Wei
    Lu, Chenwei
    Zhuang, Fuzhen
    INFORMATION PROCESSING & MANAGEMENT, 2021, 58 (02)
  • [43] DeepProSite: structure-aware protein binding site prediction using ESMFold and pretrained language model
    Fang, Yitian
    Jiang, Yi
    Wei, Leyi
    Ma, Qin
    Ren, Zhixiang
    Yuan, Qianmu
    Wei, Dong-Qing
    BIOINFORMATICS, 2023, 39 (12)
  • [44] Language model switching based on topic detection for dialog speech recognition
    Lane, IR
    Kawahara, T
    Matsui, T
    2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 616 - 619
  • [45] Language Model-Driven Topic Clustering and Summarization for News Articles
    Yang, Peng
    Li, Wenhan
    Zhao, Guangzhen
    IEEE ACCESS, 2019, 7 : 185506 - 185519
  • [46] An unsupervised Web-based topic language model adaptation method
    Lecorve, Gwenole
    Gravier, Guillaume
    Sebillot, Pascale
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 5081 - 5084
  • [47] Topic Detection based on Deep Learning Language Model in Turkish Microblogs
    Sahinuc, Furkan
    Toraman, Cagri
    Koc, Aykut
    29TH IEEE CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS (SIU 2021), 2021,
  • [48] Topic independent language model for key-phrase detection and verification
    Kawahara, Tatsuya
    Doshita, Shuji
ICASSP, IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING - PROCEEDINGS, 1999, 2 : 685 - 688
  • [49] Automatic Language Identification using Suprasegmental Feature and Supervised Topic Model
    Sun, Linjia
    SSPS 2020: 2020 2ND SYMPOSIUM ON SIGNAL PROCESSING SYSTEMS, 2020, : 69 - 73
  • [50] A topic-driven language model for learning to generate diverse sentences
    Gao, Ce
    Ren, Jiangtao
    NEUROCOMPUTING, 2019, 333 : 374 - 380