Topic Structure-Aware Neural Language Model: Unified language model that maintains word and topic ordering by their embedded representations

Cited by: 4
Author
Kawamae, Noriaki [1]
Affiliation
[1] NTT COMWARE, Tokyo, Japan
Keywords
Word embeddings; Topic Models; Recurrent Neural Network; Neural Language Model
DOI
10.1145/3308558.3313757
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Our goal is a unified language model that explains the generative process of documents precisely in terms of their semantic and topic structures. Because existing methods model documents in disparate ways, we expect that coordinating them will achieve this goal more efficiently than using them in isolation; we therefore combine topic models, embedding models, and neural language models. Observing that topic models can be shared among, and indeed complement, embedding models and neural language models, we propose Word and topic 2 vec (Wat2vec) and the Topic Structure-Aware Neural Language Model (TSANL). Wat2vec treats topics as global semantic information, represents local semantic information as embeddings of topics and words, and embeds both words and topics in the same space. TSANL uses recurrent neural networks to capture long-range dependencies over topics and words. Existing topic models require time-consuming learning and scale poorly, both because they break the document's structure, such as the order of words and topics; TSANL instead maintains the order of words as phrases and the order of topics as segments. TSANL reduces computational cost and memory requirements by feeding these embedding representations to topic recurrent neural networks and topic-specific word networks. Experiments show that TSANL maintains both segments and topical phrases, and so improves on previous models.
Pages: 2900 - 2906
Number of pages: 7
Related papers
50 in total
  • [1] Topic Compositional Neural Language Model
    Wang, Wenlin
    Gan, Zhe
    Wang, Wenqi
    Shen, Dinghan
    Huang, Jiaji
    Ping, Wei
    Satheesh, Sanjeev
    Carin, Lawrence
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84, 2018, 84
  • [2] Tipster: A Topic-Guided Language Model for Topic-Aware Text Segmentation
    Gong, Zheng
    Tong, Shiwei
    Wu, Han
    Liu, Qi
    Tao, Hanqing
    Huang, Wei
    Yu, Runlong
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2022, PT III, 2022, : 213 - 221
  • [3] Dependency structure language model for topic detection and tracking
    Lee, Changki
    Lee, Gary Geunbae
    Jang, Myunggil
    INFORMATION PROCESSING & MANAGEMENT, 2007, 43 (05) : 1249 - 1259
  • [4] Identifying informative tweets during a pandemic via a topic-aware neural language model
    Wang Gao
    Lin Li
    Xiaohui Tao
    Jing Zhou
    Jun Tao
    World Wide Web, 2023, 26 : 55 - 70
  • [5] Identifying informative tweets during a pandemic via a topic-aware neural language model
    Gao, Wang
    Li, Lin
    Tao, Xiaohui
    Zhou, Jing
    Tao, Jun
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2023, 26 (01) : 55 - 70
  • [6] UniSAr: a unified structure-aware autoregressive language model for text-to-SQL semantic parsing
    Longxu Dou
    Yan Gao
    Mingyang Pan
    Dingzirui Wang
    Wanxiang Che
    Jian-Guang Lou
    Dechen Zhan
    International Journal of Machine Learning and Cybernetics, 2023, 14 : 4361 - 4376
  • [7] UniSAr: a unified structure-aware autoregressive language model for text-to-SQL semantic parsing
    Dou, Longxu
    Gao, Yan
    Pan, Mingyang
    Wang, Dingzirui
    Che, Wanxiang
    Lou, Jian-Guang
    Zhan, Dechen
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2023, 14 (12) : 4361 - 4376
  • [8] Retrofitting Structure-aware Transformer Language Model for End Tasks
    Fei, Hao
    Ren, Yafeng
    Ji, Donghong
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 2151 - 2161
  • [9] Explainable and Discourse Topic-aware Neural Language Understanding
    Chaudhary, Yatin
    Schutze, Hinrich
    Gupta, Pankaj
    25TH AMERICAS CONFERENCE ON INFORMATION SYSTEMS (AMCIS 2019), 2019,
  • [10] Explainable and Discourse Topic-aware Neural Language Understanding
    Chaudhary, Yatin
    Schutze, Hinrich
    Gupta, Pankaj
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119