Scientific publications clustering using textual and citation information

Cited by: 2
Author
Chikhi, Nacim Fateh [1]
Affiliation
[1] Univ Blida 1, Fac Sci, Dept Comp Sci, BP 270 Route Soumaa, Blida 09000, Algeria
Keywords
Document clustering; Text mining; Science mapping; Relatedness measures
DOI
10.1016/j.eswa.2024.123319
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering scientific publications has attracted considerable attention, and many approaches have been proposed. A key challenge is how to combine citation and textual information to improve clustering quality. In this paper, we explore the use of the von Mises-Fisher distribution, which is particularly well suited to the analysis of directional data, for clustering scientific documents. More precisely, we propose a multi-view version of the mixture of von Mises-Fisher distributions in which one view corresponds to textual information and the other to citation information. The hypothesis underlying our approach is that both text and citation data are directional. To estimate the parameters of the proposed model, we use the Expectation-Maximization algorithm together with deterministic annealing to escape poor local maxima. Experiments on two real-world datasets show that our algorithm outperforms baseline algorithms in terms of clustering accuracy.
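The abstract gives no update equations, so the following is only a rough sketch of the general idea, not the paper's algorithm. It assumes a single fixed concentration `kappa` shared by all clusters and both views (so the von Mises-Fisher normalizing constants cancel in the E-step), conditional independence of the two views given the cluster, and a short hand-picked annealing schedule `betas`. The function names, parameters, and defaults are all hypothetical.

```python
import numpy as np


def l2_normalize(X, eps=1e-12):
    """Project rows onto the unit sphere; all-zero rows stay zero."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, eps)


def multiview_vmf_em(X_text, X_cite, n_clusters, kappa=20.0,
                     betas=(0.1, 0.3, 1.0), n_iter=30, seed=0):
    """Two-view mixture of vMF-like components fitted by annealed EM.

    Assumptions (not taken from the paper): kappa is fixed and shared,
    and the views are independent given the cluster assignment.
    """
    rng = np.random.default_rng(seed)
    Xt = l2_normalize(np.asarray(X_text, dtype=float))
    Xc = l2_normalize(np.asarray(X_cite, dtype=float))
    n = Xt.shape[0]

    # Initialise the mean directions from randomly chosen documents.
    idx = rng.choice(n, size=n_clusters, replace=False)
    mu_t, mu_c = Xt[idx].copy(), Xc[idx].copy()
    pi = np.full(n_clusters, 1.0 / n_clusters)
    resp = None

    # Deterministic annealing: raise the inverse temperature beta to 1.
    for beta in betas:
        for _ in range(n_iter):
            # E-step: tempered posterior over clusters, both views summed.
            log_p = np.log(pi + 1e-12) + kappa * (Xt @ mu_t.T + Xc @ mu_c.T)
            log_p = beta * log_p
            log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
            resp = np.exp(log_p)
            resp /= resp.sum(axis=1, keepdims=True)

            # M-step: mixing weights and one mean direction per view.
            pi = resp.mean(axis=0)
            mu_t = l2_normalize(resp.T @ Xt)
            mu_c = l2_normalize(resp.T @ Xc)

    return resp.argmax(axis=1), resp
```

Here `X_text` could be a TF-IDF matrix and `X_cite` a matrix of citation-link indicators with one row per document. Rows are normalized to unit length because a directional model uses only the direction of each vector, not its magnitude.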
Pages: 6