Characterizing in-text citations in scientific articles: A large-scale analysis

Cited by: 84
Authors
Boyack, Kevin W. [1 ]
van Eck, Nees Jan [2 ]
Colavizza, Giovanni [3 ]
Waltman, Ludo [2 ]
Affiliations
[1] SciTech Strategies Inc, Albuquerque, NM 87122 USA
[2] Leiden Univ, Ctr Sci & Technol Studies CWTS, Leiden, Netherlands
[3] Ecole Polytech Fed Lausanne, Digital Humanities Lab, Lausanne, Switzerland
Funding
Swiss National Science Foundation
Keywords
In-text citations; Citation position analysis; Field-level analysis; Reference age; Citation counts; REFERENCES; FREQUENCY; PUBLICATIONS; ACCURACY; CONTEXT; COUNTS;
DOI
10.1016/j.joi.2017.11.005
Chinese Library Classification
TP39 [Computer applications]
Discipline classification code
081203 ; 0835 ;
Abstract
We report characteristics of in-text citations in over five million full-text articles from two large databases (the PubMed Central Open Access subset and Elsevier journals) as functions of time, textual progression, and scientific field. The purpose of this study is to understand the characteristics of in-text citations in detail before pursuing other studies focused on more substantive research questions. As such, we have analyzed in-text citations in several ways and report many findings here. Perhaps most significantly, we find large field-level differences that are reflected in position within the text, citation interval (or reference age), and citation counts of references. In general, the fields of Biomedical and Health Sciences, Life and Earth Sciences, and Physical Sciences and Engineering have similar reference distributions, although they vary in their specifics. The two remaining fields, Mathematics and Computer Science and Social Science and Humanities, have reference distributions that differ both from the other three fields and from each other. We also show that in all fields the numbers of sentences, references, and in-text mentions per article have increased over time, and that there are field-level and temporal differences in the numbers of in-text mentions per reference. A final finding is that references mentioned only once tend to be much more highly cited than those mentioned multiple times. (C) 2017 Elsevier Ltd. All rights reserved.
Pages: 59-73
Page count: 15