A Comparative Study of Fuzzy Topic Models and LDA in terms of Interpretability

Cited by: 9

Authors:

Rijcken, Emil [1]

Scheepers, Floortje [2]

Mosteiro, Pablo [3]

Zervanou, Kalliopi [4]

Spruit, Marco [5]

Kaymak, Uzay [1]

Affiliations:

[1] Eindhoven Univ Technol, Jheronimus Acad Data Sci, Eindhoven, Netherlands

[2] Univ Med Ctr Utrecht, Psychiat, Utrecht, Netherlands

[3] Univ Med Ctr Utrecht, Informat & Comp Sci, Utrecht, Netherlands

[4] Eindhoven Univ Technol, Ind Engn & Informat Sci, Eindhoven, Netherlands

[5] Leiden Univ, Med Ctr, Publ Hlth & Primary Care, Leiden, Netherlands

Source:

2021 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2021) | 2021

Keywords:

Topic Models; Text Classification; Fuzzy Modelling; Explainable AI; NLP; CLASSIFICATION; TEXT;

DOI:

10.1109/SSCI50451.2021.9660139

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In many domains that employ machine learning, models must be both high-performing and interpretable. Text classification is a typical machine learning task in which models are hardly interpretable. Topic models, used as topic embeddings, have the potential to help explain the decisions made by text classification algorithms. With this goal in mind, we propose two new fuzzy topic models: FLSA-W and FLSA-V. Both are derived from the topic model Fuzzy Latent Semantic Analysis (FLSA). After training each model ten times, we use the mean coherence score to compare the proposed models with the benchmark models Latent Dirichlet Allocation (LDA) and FLSA. The proposed models generally achieve higher coherence scores and lower standard deviations than the benchmark models. They are particularly useful as topic embeddings in text classification, because their coherence scores do not drop for a high number of topics, in contrast to the decay that occurs with LDA and FLSA.

Pages: 8

