Hybrid Text Embedding and Evolutionary Algorithm Approach for Topic Clustering in Online Discussion Forums

Cited by: 1
Authors
Bouabdallaoui, Ibrahim [1]
Guerouate, Fatima [1]
Sbihi, Mohammed [1]
Affiliation
[1] Mohammed V Univ Rabat, LASTIMI Lab EST Sale, Ave Prince Heritier, Sale, Morocco
Keywords
LDA; BERT; K-Means; Genetic Algorithms; Forum Analysis;
DOI
10.14201/adcaij.31448
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Discussion forums have become a major medium for information exchange, and the resulting surge in data makes topic clustering on these platforms essential for understanding user interests, preferences, and concerns. This study introduces a methodology for topic clustering that combines two text embedding techniques, Latent Dirichlet Allocation (LDA) and BERT, trained on a single autoencoder. It also proposes combining K-Means with Genetic Algorithms to cluster topics within triadic discussion forum threads. The proposed technique begins with a preprocessing stage that cleans and tokenizes the textual data, which is then transformed into a vector representation using the hybrid text embedding method. Next, the K-Means algorithm clusters these vectorized data points, and Genetic Algorithms optimize the parameters of the K-Means clustering. We assess the approach by computing cosine similarities between topics and by comparing performance in terms of coherence and graph visualization. The results confirm that the hybrid text embedding methodology, coupled with evolutionary algorithms, enhances the quality of topic clustering across various discussion forum themes. This investigation contributes to the development of effective methods for clustering discussion forums, with potential applications in social media analysis, online education, and customer response analysis.
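The clustering stage described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which K-Means parameters the Genetic Algorithm tunes or which fitness function it uses, so the sketch assumes the genome is a (number of clusters, initialization seed) pair and the fitness is a silhouette score. The toy 2-D vectors stand in for the hybrid LDA+BERT embeddings, and all function names (`toy_embeddings`, `kmeans`, `silhouette`, `evolve`) are hypothetical.

```python
import math
import random

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points]

def kmeans(points, k, seed, iters=20):
    """Plain Lloyd's algorithm; initial centroids are k points sampled with `seed`."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iters):
        labels = assign(points, centroids)
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old centroid if a cluster went empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return assign(points, centroids), centroids

def silhouette(points, labels):
    """Mean silhouette coefficient; -1.0 if fewer than two clusters are used."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def evolve(points, k_max=6, pop_size=8, generations=10, seed=0):
    """Tiny GA over (k, init_seed) genomes; assumed fitness = silhouette score."""
    rng = random.Random(seed)
    def fitness(ind):
        labels, _ = kmeans(points, ind[0], ind[1])
        return silhouette(points, labels)
    pop = [(rng.randint(2, k_max), rng.randrange(1000)) for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child_k, child_seed = a[0], b[1]       # crossover: k from a, seed from b
            if rng.random() < 0.3:                 # mutation: nudge k by one
                child_k = min(k_max, max(2, child_k + rng.choice([-1, 1])))
            children.append((child_k, child_seed))
        pop = parents + children
    return max(pop, key=fitness)

def toy_embeddings(seed=42):
    """Three well-separated 2-D blobs standing in for hybrid text embeddings."""
    rng = random.Random(seed)
    centers = [(0.0, 5.0), (5.0, 0.0), (5.0, 5.0)]
    return [[cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3)]
            for cx, cy in centers for _ in range(15)]

if __name__ == "__main__":
    points = toy_embeddings()
    k, init_seed = evolve(points)
    labels, centroids = kmeans(points, k, init_seed)
    print("selected k:", k, "silhouette:", round(silhouette(points, labels), 3))
    # Pairwise cosine similarity between cluster centroids ("topics")
    for i in range(k):
        for j in range(i + 1, k):
            print(i, j, round(cosine(centroids[i], centroids[j]), 3))
```

In a realistic setting the toy vectors would be replaced by the autoencoder's latent representation of each post, and the genome could carry other K-Means settings (for example, the initial centroids themselves), depending on what the full paper specifies.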
Pages: 24