Semantically-Enhanced Topic Modeling

Cited by: 10
Authors
Viegas, Felipe [1]
Luiz, Washington [1]
Gomes, Christian [2]
Khatibi, Amir [1]
Canuto, Sergio [3]
Mourao, Fernando [4]
Salles, Thiago [1]
Rocha, Leonardo [2]
Goncalves, Marcos Andre [1]
Affiliations
[1] Univ Fed Minas Gerais, Belo Horizonte, MG, Brazil
[2] Univ Fed Sao Joao del Rei, Sao Joao del Rei, Brazil
[3] IFG, Luziania, Brazil
[4] Seek AI Labs, Belo Horizonte, MG, Brazil
Keywords
Topic Modeling; Word Embeddings; Bag of Words
DOI
10.1145/3269206.3271797
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In this paper, we advance the state of the art in topic modeling by designing and developing a novel (semi-formal) general topic modeling framework. The novel contributions of our solution include: (i) the introduction of new semantically-enhanced data representations for topic modeling based on pooling, and (ii) the proposal of a novel topic extraction strategy, ASToC, that addresses the difficulty of representing topics in our semantically-enhanced information space. In our extensive experimental evaluation, covering 12 datasets and 12 state-of-the-art baselines and totaling 108 tests, our method outperforms the baselines (with a few ties) in almost 100 cases, with gains of more than 50% against the best baselines (and up to 80% against some runners-up). We provide qualitative and quantitative statistical analyses of why our solutions work so well. Finally, we show that our method is able to improve document representation in automatic text classification.
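To make the idea of a pooling-based, semantically-enhanced representation more concrete, the sketch below shows one possible way to enrich a bag-of-words matrix with word-embedding similarity. It is an illustration only, not the authors' implementation: the abstract does not specify the pooling scheme, so the cosine-similarity threshold, the function names (`cosine_similarity_matrix`, `pooled_representation`), and the toy vocabulary and vectors are all assumptions.

```python
import numpy as np


def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity between the rows of `embeddings`."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return unit @ unit.T


def pooled_representation(bow, embeddings, threshold=0.7):
    """Spread each term count over semantically similar vocabulary words.

    Assumed scheme (hypothetical, for illustration): a word contributes to
    another word's column only if their cosine similarity reaches `threshold`,
    and the contribution is weighted by that similarity.
    """
    sim = cosine_similarity_matrix(embeddings)
    weights = np.where(sim >= threshold, sim, 0.0)
    return bow @ weights


# Toy data (assumed): a 3-word vocabulary with 2-dimensional embeddings.
vocab = ["car", "auto", "banana"]
embeddings = np.array([
    [1.0, 0.0],   # car
    [0.8, 0.6],   # auto (close to "car")
    [0.0, 1.0],   # banana
])

# Two documents as raw term counts over `vocab`.
bow = np.array([
    [2.0, 0.0, 0.0],  # mentions "car" twice
    [0.0, 0.0, 1.0],  # mentions "banana" once
])

enhanced = pooled_representation(bow, embeddings)
print(enhanced)
```

In this toy case the first document, which only mentions "car", also receives weight in the "auto" column, while "banana" stays isolated because none of its similarities reach the threshold. A topic model would then be fitted on `enhanced` instead of the raw counts.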
Pages: 893-902
Number of pages: 10
Related papers
50 in total
  • [31] Combine Topic Modeling with Semantic Embedding: Embedding Enhanced Topic Model
    Zhang, Peng
    Wang, Suge
    Li, Deyu
    Li, Xiaoli
    Xu, Zhikang
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2020, 32 (12) : 2322 - 2335
  • [32] An Enhanced Topic Modeling Approach to Multiple Stance Identification
    Lin, Junjie
    Mao, Wenji
    Zhang, Yuhao
    CIKM'17: PROCEEDINGS OF THE 2017 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2017 : 2167 - 2170
  • [33] Semantically Enhanced Term Frequency
    Mueller, Christof
    Gurevych, Iryna
    ADVANCES IN INFORMATION RETRIEVAL, PROCEEDINGS, 2010, 5993 : 598 - 601
  • [34] Semantically Enhanced Recommender Systems
    Ruiz-Montiel, Manuela
    Aldana-Montes, Jose F.
    ON THE MOVE TO MEANINGFUL INTERNET SYSTEMS: OTM 2009 WORKSHOPS, 2009, 5872 : 604 - 609
  • [35] Semantically Enhanced Entity Ranking
    Demartini, Gianluca
    Firan, Claudiu S.
    Iofciu, Tereza
    Nejdl, Wolfgang
    WEB INFORMATION SYSTEMS ENGINEERING - WISE 2008, PROCEEDINGS, 2008, 5175 : 176 - 188
  • [36] Lifelong topic modeling with knowledge-enhanced adversarial network
    Xuewen Zhang
    Yanghui Rao
    Qing Li
    World Wide Web, 2022, 25 : 219 - 238
  • [37] Enhanced Frequent Itemsets Based on Topic Modeling in Information Filtering
    Than Than Wai
    Aung, Sint Sint
    2017 16TH IEEE/ACIS INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE (ICIS 2017), 2017 : 155 - 160
  • [38] Enhanced Topic Modeling with Multi-modal Representation Learning
    Zhang, Duoyi
    Wang, Yue
    Abul Bashar, Md
    Nayak, Richi
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2023, PT I, 2023, 13935 : 393 - 404
  • [39] Lifelong topic modeling with knowledge-enhanced adversarial network
    Zhang, Xuewen
    Rao, Yanghui
    Li, Qing
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2022, 25 (01) : 219 - 238
  • [40] An Image-Enhanced Topic Modeling Method for Neuroimaging Literature
    Ma, Lianfang
    Chen, Jianhui
    Zhong, Ning
    BRAIN INFORMATICS, BI 2021, 2021, 12960 : 299 - 309