A novel machine-learning approach to measuring scientific knowledge flows using citation context analysis

Cited by: 53
Authors
Hassan, Saeed-Ul [1]
Safder, Iqra [1]
Akram, Anam [1]
Kamiran, Faisal [1]
Affiliations
[1] Information Technology University, 346-B, Ferozepur Rd, Lahore 54700, Pakistan
Keywords
Knowledge flows; Machine learning; Citation context classification; Influential citations; Citation analysis; INFORMATION-SCIENCE; PATENT CITATIONS; INSTITUTIONS; SPECIALTY; DIFFUSION; SPACE; US
DOI
10.1007/s11192-018-2767-x
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
We measure the knowledge flows between countries by analysing publication and citation data, arguing that not all citations are equally important. In contrast to existing techniques that use absolute citation counts to quantify knowledge flows between entities, our model employs citation context analysis, using a machine-learning approach to distinguish between important and non-important citations. We use 14 novel features (including context-based, cue-word-based and text-based features) to train Support Vector Machine (SVM) and Random Forest classifiers on an annotated dataset of 20,527 publications downloaded from the Association for Computational Linguistics anthology (http://allenai.org/data.html). Our machine-learning models outperform existing state-of-the-art citation context approaches, with the SVM model reaching up to 61% and the Random Forest model up to a very encouraging 90% Precision-Recall Area Under the Curve under 10-fold cross-validation. Finally, we present a case study that applies our method to a dataset of PLoS ONE full-text publications in the field of Computer and Information Sciences. Our results show that a significant volume of the knowledge flowing from the United States, based on important citations, is consumed by the international scientific community. Of the total knowledge flow from China, a relatively small proportion (only 4.11%) is based on important citations, while The Netherlands and Germany show the highest proportions of knowledge flows based on important citations, at 9.06% and 7.35% respectively. Among institutions, interestingly, more than 10% of the knowledge produced at the University of Malaya falls into the important category. We believe that such analyses help in understanding the dynamics of the relevant knowledge flows across nations and institutions.
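The per-country proportions quoted in the abstract can be read as a simple ratio: among the citations that flow out of a country, the share that a classifier has labelled important. The following is a minimal sketch of that aggregation only, not the authors' implementation. The `Citation` record, its field names, the `important_flow_share` function, and the toy data are all hypothetical, and skipping domestic citations is an assumption made here rather than something the abstract states.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple


class Citation(NamedTuple):
    """One citation link; all field names are hypothetical."""
    cited_country: str   # country that produced the cited paper (knowledge source)
    citing_country: str  # country of the citing paper (knowledge consumer)
    important: bool      # label a citation-context classifier would assign


def important_flow_share(citations: Iterable[Citation]) -> Dict[str, float]:
    """Percentage of each source country's outgoing citations labelled important."""
    total: Counter = Counter()
    important: Counter = Counter()
    for c in citations:
        if c.cited_country == c.citing_country:
            # Assumption: domestic citations are not a flow between countries.
            continue
        total[c.cited_country] += 1
        if c.important:
            important[c.cited_country] += 1
    return {country: 100.0 * important[country] / n for country, n in total.items()}


# Toy data invented for illustration only; these are not figures from the paper.
toy = [
    Citation("US", "DE", True),
    Citation("US", "CN", False),
    Citation("US", "US", True),  # domestic, skipped
    Citation("CN", "US", False),
    Citation("CN", "NL", False),
    Citation("NL", "US", True),
    Citation("NL", "DE", False),
    Citation("US", "NL", False),
    Citation("US", "PK", True),
]

shares = important_flow_share(toy)
for country in sorted(shares):
    print(f"{country}: {shares[country]:.2f}% of outgoing citations are important")
```

Keeping the classifier label as a plain boolean field means the aggregation does not depend on which model (SVM or Random Forest) produced it.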
Pages: 973-996
Number of pages: 24