Hybrid Text Embedding and Evolutionary Algorithm Approach for Topic Clustering in Online Discussion Forums

Cited by: 1
Authors
Bouabdallaoui, Ibrahim [1]
Guerouate, Fatima [1]
Sbihi, Mohammed [1]
Affiliations
[1] Mohammed V Univ Rabat, LASTIMI Lab EST Sale, Ave Prince Heritier, Sale, Morocco
Keywords
LDA; BERT; K-Means; Genetic Algorithms; Forum Analysis
DOI
10.14201/adcaij.31448
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Leveraging discussion forums as a medium for information exchange has led to a surge in data, making topic clustering in these platforms essential for understanding user interests, preferences, and concerns. This study introduces an innovative methodology for topic clustering that combines two text embedding techniques, Latent Dirichlet Allocation (LDA) and BERT, trained on a single autoencoder. Additionally, it proposes an amalgamation of K-Means and Genetic Algorithms for clustering topics within triadic discussion forum threads. The proposed technique begins with a preprocessing stage to clean and tokenize textual data, which is then transformed into a vector representation using the hybrid text embedding method. Subsequently, the K-Means algorithm clusters these vectorized data points, and Genetic Algorithms optimize the parameters of the K-Means clustering. We assess the efficacy of our approach by computing cosine similarities between topics and comparing performance against coherence and graph visualization. The results confirm that the hybrid text embedding methodology, coupled with evolutionary algorithms, enhances the quality of topic clustering across various discussion forum themes. This investigation contributes significantly to the development of effective methods for clustering discussion forums, with potential applications in diverse domains, including social media analysis, online education, and customer response analysis.
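The clustering stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say which K-Means parameters the Genetic Algorithm tunes, so this sketch assumes the GA searches over the choice of initial centroids and scores each candidate by the negative within-cluster sum of squares. Small 2D toy vectors stand in for the hybrid LDA and BERT autoencoder embeddings, and all function names (`toy_vectors`, `kmeans`, `evolve`, `cosine`) are hypothetical.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def kmeans(points, centroids, iters=20):
    """Lloyd's K-Means from the given initial centroids; returns (centroids, inertia)."""
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members; keep it unchanged if empty.
        centroids = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else old
            for members, old in zip(clusters, centroids)
        ]
    inertia = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, inertia


def fitness(points, genome):
    """Assumed fitness: negative inertia of K-Means started from the genome's points."""
    return -kmeans(points, [points[i] for i in genome])[1]


def evolve(points, k, pop_size=20, generations=10, mutation_rate=0.2, seed=0):
    """GA over initial-centroid indices (requires k >= 2); returns (centroids, inertia)."""
    rng = random.Random(seed)
    n = len(points)
    population = [rng.sample(range(n), k) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda g: fitness(points, g), reverse=True)
        parents = ranked[: pop_size // 2]  # truncation selection, parents survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, k)
            child = a[:cut] + b[cut:]  # one-point crossover
            if rng.random() < mutation_rate:
                child[rng.randrange(k)] = rng.randrange(n)  # point mutation
            children.append(child)
        population = parents + children
    best = max(population, key=lambda g: fitness(points, g))
    return kmeans(points, [points[i] for i in best])


def toy_vectors(seed=1):
    """Three well-separated 2D blobs standing in for document embeddings."""
    rng = random.Random(seed)
    centers = [(5.0, 0.0), (0.0, 5.0), (-5.0, -5.0)]
    return [(cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3))
            for cx, cy in centers for _ in range(10)]


if __name__ == "__main__":
    centroids, inertia = evolve(toy_vectors(), k=3)
    print("inertia:", round(inertia, 3))
    for i in range(len(centroids)):
        for j in range(i + 1, len(centroids)):
            print(i, j, round(cosine(centroids[i], centroids[j]), 3))
```

Encoding candidate solutions as lists of point indices keeps crossover and mutation trivial, and retaining the parents in each generation means the best fitness never decreases. The pairwise cosine similarities between centroids printed at the end loosely mirror the topic-similarity evaluation mentioned in the abstract; low or negative values suggest well-separated clusters.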
Pages: 24