Hybrid Text Embedding and Evolutionary Algorithm Approach for Topic Clustering in Online Discussion Forums

Cited by: 1
Authors
Bouabdallaoui, Ibrahim [1 ]
Guerouate, Fatima [1 ]
Sbihi, Mohammed [1 ]
Affiliations
[1] Mohammed V Univ Rabat, LASTIMI Lab EST Sale, Ave Prince Heritier, Sale, Morocco
Keywords
LDA; BERT; K-Means; Genetic Algorithms; Forum Analysis;
DOI
10.14201/adcaij.31448
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The use of discussion forums as a medium for information exchange has led to a surge in data, making topic clustering on these platforms essential for understanding user interests, preferences, and concerns. This study introduces a methodology for topic clustering that combines two text embedding techniques, Latent Dirichlet Allocation (LDA) and BERT, trained on a single autoencoder. It also proposes combining K-Means with Genetic Algorithms to cluster topics within triadic discussion forum threads. The proposed technique begins with a preprocessing stage that cleans and tokenizes the textual data, which is then transformed into a vector representation using the hybrid text embedding method. The K-Means algorithm then clusters these vectorized data points, and Genetic Algorithms optimize the parameters of the K-Means clustering. We assess the approach by computing cosine similarities between topics and by evaluating performance with coherence and graph visualization. The results indicate that the hybrid text embedding methodology, coupled with evolutionary algorithms, improves the quality of topic clustering across various discussion forum themes. This work contributes to the development of effective methods for clustering discussion forums, with potential applications in domains such as social media analysis, online education, and customer response analysis.
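The clustering stage described in the abstract can be illustrated with a minimal sketch. The abstract does not say which K-Means parameters the Genetic Algorithm tunes, so this example assumes, purely for illustration, that the GA searches over the choice of initial centroids and uses the final within-cluster squared distance (inertia) as fitness. The toy vectors stand in for LDA and BERT outputs, and plain concatenation replaces the autoencoder step; the names `kmeans`, `ga_kmeans`, and all hyperparameters are hypothetical and not taken from the paper.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]

def kmeans(points, centroids, iters=20):
    """Lloyd's K-Means from given starting centroids; returns (labels, inertia)."""
    centroids = [list(c) for c in centroids]
    for _ in range(iters):
        labels = assign(points, centroids)
        for j in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    labels = assign(points, centroids)
    inertia = sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))
    return labels, inertia

def ga_kmeans(points, k, pop_size=8, generations=10, mutation_rate=0.3, seed=0):
    """Assumed GA: each individual is a tuple of k point indices used as initial centroids."""
    rng = random.Random(seed)
    n = len(points)

    def fitness(ind):  # lower inertia is better
        return kmeans(points, [points[i] for i in ind])[1]

    population = [tuple(rng.sample(range(n), k)) for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(population, key=fitness)[: pop_size // 2]  # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = rng.sample(sorted(set(a) | set(b)), k)  # crossover: mix parent indices
            if rng.random() < mutation_rate:                # mutation: swap in an unused index
                unused = [i for i in range(n) if i not in child]
                if unused:
                    child[rng.randrange(k)] = rng.choice(unused)
            children.append(tuple(child))
        population = parents + children
    best = min(population, key=fitness)
    return kmeans(points, [points[i] for i in best])

# Toy stand-ins for per-post LDA topic distributions and BERT embeddings.
lda = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15], [0.1, 0.9], [0.2, 0.8], [0.15, 0.85]]
bert = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1], [0.0, 1.0], [0.1, 0.9], [0.0, 0.9]]
hybrid = [l + b for l, b in zip(lda, bert)]  # concatenation as a simple hybrid vector

labels, inertia = ga_kmeans(hybrid, k=2)
print(labels, round(inertia, 4))
```

In this sketch the first three posts and the last three posts should fall into separate clusters. A real pipeline would replace the toy vectors with actual LDA and BERT outputs and could let the GA tune other parameters, such as the number of clusters.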
Pages: 24