Comparative Analysis of Community Detection and Transformer-Based Approaches for Topic Clustering of Scientific Papers

Cited by: 1
Authors
Bretsko, Daniel [1]
Belyi, Alexander [1]
Sobolevsky, Stanislav [1, 2, 3]
Affiliations
[1] Masaryk Univ, Dept Math & Stat, Fac Sci, Kotlarska 2, Brno 61137, Czech Republic
[2] New York Univ, Ctr Urban Sci & Progress, 370 Jay St, Brooklyn, NY 11201 USA
[3] Masaryk Univ, Inst Law & Technol, Fac Law, Veveri 70, Brno 61180, Czech Republic
Keywords
Network analysis; NLP; Topic clustering; Community detection; Sentence-transformers
DOI
10.1007/978-3-031-36805-9_42
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We address the topic clustering problem: grouping papers that already carry subject labels into more consistent, higher-level topics. We approach the task from two perspectives. The first is traditional network science, where we perform community detection on a subject network using the Combo algorithm. The second is the transformer-based top2vec algorithm, which uses a sentence-transformer to embed the content of the papers. We compared the two approaches on a dataset of computer science and mathematics papers collected from the SCOPUS database, using several coherence scores as performance measures. The Combo community detection algorithm achieved a coherence score similar to that of the transformer-based top2vec. These findings suggest that community detection may be a viable alternative for topic clustering when pre-defined topics are available, especially when a high coherence score and fast processing time are desired. The paper also discusses the potential advantages and limitations of using Combo for topic clustering and directions for future work.
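As a rough illustration of the network-science side of the abstract, the sketch below builds a weighted subject co-occurrence network from per-paper subject lists and scores a candidate grouping with Newman modularity, a standard quality function for community detection. It is not the authors' implementation and does not run Combo itself. The function names, the edge-weighting rule (number of papers listing both subjects), and the toy paper data are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def build_subject_network(papers):
    """Return weighted co-occurrence edges between subjects.

    Assumed weighting: the number of papers that list both subjects.
    """
    edges = Counter()
    for subjects in papers:
        for a, b in combinations(sorted(set(subjects)), 2):
            edges[(a, b)] += 1
    return edges


def modularity(edges, community_of):
    """Newman modularity Q of a partition of a weighted undirected graph."""
    total = sum(edges.values())      # total edge weight m
    strength = Counter()             # weighted degree of each subject
    internal = Counter()             # edge weight inside each community
    for (a, b), w in edges.items():
        strength[a] += w
        strength[b] += w
        if community_of[a] == community_of[b]:
            internal[community_of[a]] += w
    comm_strength = Counter()        # summed weighted degree per community
    for node, s in strength.items():
        comm_strength[community_of[node]] += s
    return sum(
        internal[c] / total - (comm_strength[c] / (2 * total)) ** 2
        for c in comm_strength
    )


# Hypothetical toy data: each paper is a list of its subject labels.
papers = [
    ["machine learning", "neural networks"],
    ["machine learning", "neural networks", "optimization"],
    ["graph theory", "combinatorics"],
    ["graph theory", "combinatorics"],
    ["optimization", "combinatorics"],
]
edges = build_subject_network(papers)

# Candidate higher-level topics (an assumed partition, not a Combo result).
grouped = {
    "machine learning": 0, "neural networks": 0, "optimization": 0,
    "graph theory": 1, "combinatorics": 1,
}
single = {subject: 0 for subject in grouped}

q_grouped = modularity(edges, grouped)
q_single = modularity(edges, single)
print(q_grouped > q_single)
```

A community detection algorithm searches over partitions for one with a high score of this kind. Here the two-topic grouping scores higher than placing every subject in a single community, which has modularity zero.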
Pages: 648-660
Number of pages: 13