Open data and open code for big science of science studies

Cited by: 43
Authors
Light, Robert P. [1 ]
Polley, David E. [1 ]
Boerner, Katy [1 ]
Affiliations
[1] Indiana Univ, Sch Informat & Comp, Cyberinfrastruct Network Sci Ctr, Bloomington, IN 47405 USA
Funding
U.S. National Science Foundation (NSF);
Keywords
Open data; Visualization software; Big data; Scalability; Workflows;
DOI
10.1007/s11192-014-1238-2
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Historically, science of science (Sci2) studies have been performed by single investigators or small teams. As the size and complexity of datasets and analyses scale up, a "Big Science" approach (Price, Little science, big science, 1963) is required that exploits the expertise and resources of interdisciplinary teams spanning academic, government, and industry boundaries. Big Sci2 studies utilize "big data", i.e., large, complex, diverse, longitudinal, and/or distributed datasets that might be owned by different stakeholders. They apply a systems science approach to uncover hidden patterns, bursts of activity, correlations, and laws. They make available open data and open code in support of replication of results, iterative refinement of approaches and tools, and education. This paper introduces a database-tool infrastructure that was designed to support big Sci2 studies. The open access Scholarly Database (http://sdb.cns.iu.edu) provides easy access to 26 million paper, patent, grant, and clinical trial records. The open source Sci2 tool (http://sci2.cns.iu.edu) supports temporal, geospatial, topical, and network studies. The scalability of the infrastructure is examined. Results show that temporal analyses scale linearly with the number of records and file size, while the geospatial algorithm shows quadratic growth. For network-based algorithms, the number of edges rather than the number of nodes determines performance.
Pages: 1535-1551
Page count: 17
Related papers
50 in total
  • [1] OPEN DATA AND OPEN CODE FOR BIG SCIENCE OF SCIENCE STUDIES
    Light, Robert P.
    Polley, David E.
    Boerner, Katy
    [J]. 14TH INTERNATIONAL SOCIETY OF SCIENTOMETRICS AND INFORMETRICS CONFERENCE (ISSI), 2013, : 1342 - 1356
  • [2] Open data and open code for big science of science studies
    Robert P. Light
    David E. Polley
    Katy Börner
    [J]. Scientometrics, 2014, 101 : 1535 - 1551
  • [3] Automating Open Science for Big Data
    Crosas, Merce
    King, Gary
    Honaker, James
    Sweeney, Latanya
    [J]. ANNALS OF THE AMERICAN ACADEMY OF POLITICAL AND SOCIAL SCIENCE, 2015, 659 (01): : 260 - 273
  • [4] Open code for open science?
    Steve M. Easterbrook
    [J]. Nature Geoscience, 2014, 7 : 779 - 781
  • [5] Open code for open science?
    Easterbrook, Steve M.
    [J]. NATURE GEOSCIENCE, 2014, 7 (11) : 779 - 781
  • [6] Beyond open science: Data, code, and causality
    Wolf, Levi John
    [J]. ENVIRONMENT AND PLANNING B-URBAN ANALYTICS AND CITY SCIENCE, 2023, 50 (09) : 2333 - 2336
  • [7] Big Science, Team Science, and Open Science for Neuroscience
    Koch, Christof
    Jones, Allan
    [J]. NEURON, 2016, 92 (03) : 612 - 616
  • [8] Open Science and Data Science
    Peter Wittenburg
    [J]. Data Intelligence, 2021, 3 (01) : 95 - 105
  • [9] Open Science and Data Science
    Wittenburg, Peter
    [J]. DATA INTELLIGENCE, 2021, 3 (01) : 95 - 105
  • [10] Adding Support for Theory in Open Science Big Data
    Miller, John A.
    Peng, Hao
    Cotterell, Michael E.
    [J]. 2017 IEEE 6TH INTERNATIONAL CONGRESS ON BIG DATA (BIGDATA CONGRESS 2017), 2017, : 251 - 255