Hybrid Text Embedding and Evolutionary Algorithm Approach for Topic Clustering in Online Discussion Forums

Cited by: 1
Authors
Bouabdallaoui, Ibrahim [1]
Guerouate, Fatima [1]
Sbihi, Mohammed [1]
Affiliation
[1] Mohammed V Univ Rabat, LASTIMI Lab EST Sale, Ave Prince Heritier, Sale, Morocco
Keywords
LDA; BERT; K-Means; Genetic Algorithms; Forum Analysis
DOI
10.14201/adcaij.31448
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Leveraging discussion forums as a medium for information exchange has led to a surge in data, making topic clustering in these platforms essential for understanding user interests, preferences, and concerns. This study introduces a methodology for topic clustering that combines two text embedding techniques, Latent Dirichlet Allocation (LDA) and BERT, trained on a single autoencoder. Additionally, it proposes an amalgamation of K-Means and Genetic Algorithms for clustering topics within triadic discussion forum threads. The proposed technique begins with a preprocessing stage to clean and tokenize textual data, which is then transformed into a vector representation using the hybrid text embedding method. Subsequently, the K-Means algorithm clusters these vectorized data points, and Genetic Algorithms optimize the parameters of the K-Means clustering. We assess the efficacy of our approach by computing cosine similarities between topics and comparing performance against coherence and graph visualization. The results confirm that the hybrid text embedding methodology, coupled with evolutionary algorithms, enhances the quality of topic clustering across various discussion forum themes. This investigation contributes to the development of effective methods for clustering discussion forums, with potential applications in diverse domains, including social media analysis, online education, and customer response analysis.
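The clustering stage described in the abstract (K-Means on vectorized posts, with a Genetic Algorithm tuning the K-Means parameters) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: the abstract does not say which parameters are optimized or which fitness function is used, so this sketch evolves the number of clusters `k` and the initialization seed, and scores candidates with a silhouette value. The toy 2-D points stand in for the hybrid LDA+BERT embeddings, and the function names `kmeans`, `silhouette`, and `evolve_kmeans` are hypothetical.

```python
import math
import random


def kmeans(points, k, seed, iters=20):
    """Plain Lloyd's K-Means on a list of tuples; returns cluster labels."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: mean of members; an empty cluster keeps its centroid.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette value; -1.0 if fewer than two clusters are used."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes 0 by convention
        a = sum(math.dist(points[i], points[j])
                for j in own if j != i) / (len(own) - 1)
        b = min(sum(math.dist(points[i], points[j]) for j in idx) / len(idx)
                for other, idx in clusters.items() if other != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def evolve_kmeans(points, k_max=6, pop_size=8, generations=10, seed=0):
    """Assumed GA: an individual is (k, init_seed); fitness is the silhouette."""
    rng = random.Random(seed)
    cache = {}

    def fitness(ind):
        if ind not in cache:
            cache[ind] = silhouette(points, kmeans(points, ind[0], ind[1]))
        return cache[ind]

    pop = [(rng.randint(2, k_max), rng.randrange(1000))
           for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]   # truncation selection
        children = [ranked[0]]              # elitism: keep the best as-is
        while len(children) < pop_size:
            pa, pb = rng.sample(parents, 2)
            k, s = pa[0], pb[1]             # crossover: k from one, seed from other
            if rng.random() < 0.3:          # mutation on k, clamped to [2, k_max]
                k = min(k_max, max(2, k + rng.choice((-1, 1))))
            if rng.random() < 0.3:          # mutation on the init seed
                s = rng.randrange(1000)
            children.append((k, s))
        pop = children
    best = max(pop, key=fitness)
    return best, fitness(best)


# Toy stand-in for embedded posts: three small, well-separated groups.
offsets = [(-0.5, -0.5), (0.5, -0.5), (-0.5, 0.5), (0.5, 0.5), (0.0, 0.0)]
points = [(cx + dx, cy + dy)
          for cx, cy in [(1.0, 1.0), (10.0, 1.0), (1.0, 10.0)]
          for dx, dy in offsets]
best, score = evolve_kmeans(points)
print("best (k, init seed):", best, "silhouette:", round(score, 3))
```

In a real pipeline the fitness would more plausibly be the coherence or cosine-similarity measure the abstract mentions. The silhouette is used here only because it needs no topic vocabulary and keeps the sketch free of third-party dependencies.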
Pages: 24