Topic Structure-Aware Neural Language Model: Unified language model that maintains word and topic ordering by their embedded representations

Cited by: 4
|
Authors
Kawamae, Noriaki [1]
Affiliation
[1] NTT COMWARE, Tokyo, Japan
Keywords
Word embeddings; Topic Models; Recurrent Neural Network; Neural Language Model;
DOI
10.1145/3308558.3313757
CLC Number
TP301 [Theory, Methods];
Discipline Code
081202 ;
Abstract
Our goal is to build a unified language model that precisely explains the generative process of documents in view of their semantic and topic structures. Because existing methods model documents in disparate ways, we expect that coordinating them will achieve this goal more efficiently than using any one in isolation; we therefore combine topic models, embedding models, and neural language models. Focusing on the fact that topic models can be shared among, and indeed complement, embedding models and neural language models, we propose Word and topic 2 vec (Wat2vec) and the Topic Structure-Aware Neural Language Model (TSANL). Wat2vec treats topics as global semantic information and words as local semantic information, and embeds both words and topics in the same space. TSANL uses recurrent neural networks to capture long-range dependencies over topics and words. Whereas existing topic models demand time-consuming learning and scale poorly because they break the document's structure, such as the order of words and topics, TSANL maintains the order of words and topics as phrases and segments, respectively. TSANL reduces computation cost and memory requirements by feeding the topic recurrent neural network and the topic-specific word networks with these embedding representations. Experiments show that TSANL maintains both segments and topical phrases, and so improves on previous models.
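The abstract's core idea — words and topics embedded in one shared space, with a topic representation conditioning a recurrent word model — can be illustrated with a minimal NumPy sketch. This is not the paper's actual architecture; all parameter names, sizes, and the single-RNN simplification are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (assumptions): vocab V, topics K, shared embedding dim D, hidden dim H.
V, K, D, H = 10, 3, 8, 16

# Words and topics share one embedding space of dimension D (the Wat2vec idea).
word_emb = rng.normal(scale=0.1, size=(V, D))
topic_emb = rng.normal(scale=0.1, size=(K, D))

# Parameters of a simple topic-conditioned recurrent cell (illustrative only).
W_xh = rng.normal(scale=0.1, size=(D, H))   # word input -> hidden
W_hh = rng.normal(scale=0.1, size=(H, H))   # hidden -> hidden (long-range dependencies)
W_th = rng.normal(scale=0.1, size=(D, H))   # topic embedding conditions the hidden state
W_hy = rng.normal(scale=0.1, size=(H, V))   # hidden -> next-word logits

def next_word_probs(word_ids, topic_id):
    """Run the recurrent cell over a word sequence under one topic;
    return a distribution over the next word."""
    h = np.zeros(H)
    t = topic_emb[topic_id]
    for w in word_ids:
        h = np.tanh(word_emb[w] @ W_xh + h @ W_hh + t @ W_th)
    logits = h @ W_hy
    e = np.exp(logits - logits.max())  # numerically stable softmax
    return e / e.sum()

probs = next_word_probs([1, 4, 2], topic_id=0)
print(probs.shape)
```

Because words and topics live in the same space, a word's affinity to a topic can also be scored directly (e.g. by cosine similarity between `word_emb[w]` and `topic_emb[k]`), which is the sense in which topics act as shared global semantic information.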
Pages: 2900-2906 (7 pages)