Topic Structure-Aware Neural Language Model: Unified language model that maintains word and topic ordering by their embedded representations

Cited by: 4
|
Authors
Kawamae, Noriaki [1]
Affiliation
[1] NTT COMWARE, Tokyo, Japan
Keywords
Word embeddings; Topic Models; Recurrent Neural Network; Neural Language Model;
DOI
10.1145/3308558.3313757
CLC Number
TP301 [Theory, Methods];
Discipline Code
081202 ;
Abstract
Our goal is to build a unified language model that precisely explains the generative process of documents in view of their semantic and topic structures. Because existing methods model documents in disparate ways, we expect that coordinating them will achieve this goal more efficiently than using any one in isolation; we therefore combine topic models, embedding models, and neural language models. Focusing on the fact that topic models can be shared among, and indeed complement, embedding models and neural language models, we propose Word and topic 2 vec (Wat2vec) and the Topic Structure-Aware Neural Language Model (TSANL). Wat2vec treats topics as global semantic information and words as local semantic information, and embeds both words and topics in the same space. TSANL uses recurrent neural networks to capture long-range dependencies over topics and words. Whereas existing topic models demand time-consuming learning and scale poorly because they break the document's structure, such as the order of words and topics, TSANL maintains the order of words and topics as phrases and segments, respectively. TSANL reduces computation cost and memory requirements by feeding the topic recurrent neural network and the topic-specific word networks with these embedding representations. Experiments show that TSANL maintains both segments and topical phrases, and so improves on previous models.
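The abstract's core idea — words and topics embedded in one shared space, with a topic representation conditioning a recurrent word model — can be illustrated with a minimal NumPy sketch. This is not the paper's actual architecture; all parameter names, sizes, and the single-RNN simplification are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (assumptions): vocab V, topics K, shared embedding dim D, hidden dim H.
V, K, D, H = 10, 3, 8, 16

# Words and topics share one embedding space of dimension D (the Wat2vec idea).
word_emb = rng.normal(scale=0.1, size=(V, D))
topic_emb = rng.normal(scale=0.1, size=(K, D))

# Parameters of a simple topic-conditioned recurrent cell (illustrative only).
W_xh = rng.normal(scale=0.1, size=(D, H))   # word input -> hidden
W_hh = rng.normal(scale=0.1, size=(H, H))   # hidden -> hidden (long-range dependencies)
W_th = rng.normal(scale=0.1, size=(D, H))   # topic embedding conditions the hidden state
W_hy = rng.normal(scale=0.1, size=(H, V))   # hidden -> next-word logits

def next_word_probs(word_ids, topic_id):
    """Run the recurrent cell over a word sequence under one topic;
    return a distribution over the next word."""
    h = np.zeros(H)
    t = topic_emb[topic_id]
    for w in word_ids:
        h = np.tanh(word_emb[w] @ W_xh + h @ W_hh + t @ W_th)
    logits = h @ W_hy
    e = np.exp(logits - logits.max())  # numerically stable softmax
    return e / e.sum()

probs = next_word_probs([1, 4, 2], topic_id=0)
print(probs.shape)
```

Because words and topics live in the same space, a word's affinity to a topic can also be scored directly (e.g. by cosine similarity between `word_emb[w]` and `topic_emb[k]`), which is the sense in which topics act as shared global semantic information.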
Pages: 2900-2906 (7 pages)