Scientific publications clustering using textual and citation information

Cited by: 2
Authors
Chikhi, Nacim Fateh [1]
Affiliations
[1] Univ Blida 1, Fac Sci, Dept Comp Sci, BP 270 Route Soumaa, Blida 09000, Algeria
Keywords
Document clustering; Text mining; Science mapping; Relatedness measures
DOI
10.1016/j.eswa.2024.123319
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering of scientific publications has attracted much attention, and many different approaches have been proposed. One challenge in scientific document clustering is how to combine citation and textual information to improve clustering quality. In this paper, we explore the use of the von Mises-Fisher distribution for scientific document clustering. The von Mises-Fisher distribution is particularly well suited to the analysis of directional data. More precisely, we propose a multi-view version of the mixture of von Mises-Fisher distributions in which one view corresponds to textual information and the other to citation information. The hypothesis underlying our approach is that both text and citation data are directional. To estimate the parameters of the proposed model, we use the Expectation-Maximization algorithm together with deterministic annealing to escape poor local maxima. Experiments on two real-world datasets show that our algorithm outperforms baseline algorithms in terms of clustering accuracy.
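The abstract does not give the model equations, so the following is only a minimal sketch of the general idea, not the author's implementation. It assumes each document has one unit-normalized text vector and one unit-normalized citation vector, that the two views are conditionally independent given the cluster, and that each view uses a single fixed concentration parameter (`kappa_text`, `kappa_cit`) shared by all clusters. Under that last assumption the von Mises-Fisher normalizing constant is the same for every cluster and cancels in the E-step. The function name, the annealing schedule `betas`, and all default values are hypothetical.

```python
import numpy as np

def l2_normalize(X, eps=1e-12):
    """Project each row onto the unit sphere (directional data)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, eps)

def multiview_vmf_em(X_text, X_cit, k, kappa_text=20.0, kappa_cit=20.0,
                     betas=(0.1, 0.2, 0.5, 1.0), iters_per_beta=20, seed=0):
    """Two-view vMF mixture fitted by EM with deterministic annealing.

    Assumption: concentrations are fixed and shared across clusters,
    so only mean directions and mixing weights are estimated.
    """
    rng = np.random.default_rng(seed)
    X_text, X_cit = l2_normalize(X_text), l2_normalize(X_cit)
    n = X_text.shape[0]
    # Random soft assignments (responsibilities) as a starting point.
    resp = rng.dirichlet(np.ones(k), size=n)

    # beta is the inverse temperature; it is raised step by step to 1,
    # at which point the updates are those of ordinary EM.
    for beta in betas:
        for _ in range(iters_per_beta):
            # M-step: mixing weights and one mean direction per view.
            pi = resp.mean(axis=0)
            mu_text = l2_normalize(resp.T @ X_text)
            mu_cit = l2_normalize(resp.T @ X_cit)

            # E-step: the joint log-likelihood is the sum over views.
            log_p = (np.log(pi + 1e-12)
                     + kappa_text * X_text @ mu_text.T
                     + kappa_cit * X_cit @ mu_cit.T)
            # Annealing: temper the posterior by beta before normalizing.
            log_p = beta * log_p
            log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
            resp = np.exp(log_p)
            resp /= resp.sum(axis=1, keepdims=True)

    return resp.argmax(axis=1), mu_text, mu_cit
```

In practice `X_text` might hold TF-IDF vectors and `X_cit` rows of a citation or co-citation matrix; both choices are assumptions here. A fuller model would also estimate the concentration parameters, which requires evaluating or approximating ratios of modified Bessel functions.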
Pages: 6