Scientific publications clustering using textual and citation information

Cited by: 2
Author
Chikhi, Nacim Fateh [1]
Affiliation
[1] Univ Blida 1, Fac Sci, Dept Comp Sci, BP 270 Route Soumaa, Blida 09000, Algeria
Keywords
Document clustering; Text mining; Science mapping; Relatedness measures
DOI
10.1016/j.eswa.2024.123319
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering scientific publications has attracted much attention, and many approaches have been proposed. One challenge is how to combine citation and textual information to improve clustering quality. In this paper, we explore the use of the von Mises-Fisher distribution, which is particularly well suited to the analysis of directional data, for clustering scientific documents. More precisely, we propose a multi-view version of the mixture of von Mises-Fisher distributions in which one view corresponds to textual information and the other to citation information. The hypothesis underlying our approach is that both text and citation data are directional. To estimate the parameters of the proposed model, we use the Expectation-Maximization algorithm together with deterministic annealing to escape poor local maxima. Experiments on two real-world datasets show that our algorithm outperforms baseline algorithms in terms of clustering accuracy.
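The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the estimation details, so the code below makes several simplifying assumptions. The two views are treated as conditionally independent given the cluster, each view uses a fixed concentration shared by all clusters (so the von Mises-Fisher normalizing constants cancel in the E-step), and the annealing schedule `betas` is hypothetical. The function name `multiview_vmf_em` and the toy matrices are likewise illustrative only.

```python
import numpy as np

def l2_normalize(X, eps=1e-12):
    """Project each row onto the unit sphere (directional data)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, eps)

def multiview_vmf_em(X_text, X_cit, n_clusters,
                     kappa_text=20.0, kappa_cit=20.0,
                     betas=(0.1, 0.3, 0.6, 1.0), n_iter=30, seed=0):
    """Simplified two-view vMF mixture fitted by EM with annealing.

    Assumptions (not from the source): fixed, shared concentrations
    kappa_text and kappa_cit; views independent given the cluster;
    a hand-picked inverse-temperature schedule `betas` ending at 1.
    """
    rng = np.random.default_rng(seed)
    X_text = l2_normalize(np.asarray(X_text, dtype=float))
    X_cit = l2_normalize(np.asarray(X_cit, dtype=float))
    n = X_text.shape[0]
    # Random soft assignments as the starting point.
    resp = rng.dirichlet(np.ones(n_clusters), size=n)
    for beta in betas:                      # anneal: low beta = high temperature
        for _ in range(n_iter):
            # M-step: mixing weights and one mean direction per view.
            pi = resp.mean(axis=0)
            mu_text = l2_normalize(resp.T @ X_text)
            mu_cit = l2_normalize(resp.T @ X_cit)
            # E-step: tempered posterior; cosine similarities from both
            # views are added because the view likelihoods multiply.
            log_p = (np.log(pi + 1e-12)
                     + kappa_text * (X_text @ mu_text.T)
                     + kappa_cit * (X_cit @ mu_cit.T))
            log_p = beta * log_p
            log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
            resp = np.exp(log_p)
            resp /= resp.sum(axis=1, keepdims=True)
    return resp.argmax(axis=1), resp

# Toy example: six documents, term counts (text view) and
# citation counts to two references (citation view).
X_text = [[5, 1, 0], [4, 0, 1], [6, 1, 0], [0, 1, 5], [1, 0, 4], [0, 1, 6]]
X_cit = [[3, 0], [4, 1], [3, 0], [0, 3], [1, 4], [0, 3]]
labels, resp = multiview_vmf_em(X_text, X_cit, n_clusters=2)
print(labels)
```

Starting at a low inverse temperature keeps the assignments soft, which is the usual motivation for deterministic annealing; the final stage with `beta = 1` corresponds to ordinary EM under the stated assumptions. A fuller treatment would also estimate the concentration parameters per cluster and per view.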
Pages: 6