Data hosting infrastructure for primary biodiversity data

Cited by: 8
Authors
Goddard, Anthony [1]
Wilson, Nathan [1]
Cryer, Phil [2]
Yamashita, Grant [3]
Affiliations
[1] Woods Hole Marine Biol Lab, Ctr Lib & Informat, Woods Hole, MA 02543 USA
[2] Missouri Bot Garden, Ctr Biodivers Informat CBI, St Louis, MO 63119 USA
[3] Arizona State Univ, Ctr Biol & Soc, Tempe, AZ 85287 USA
Source
BMC BIOINFORMATICS | 2011 / Vol. 12
Funding
National Institutes of Health (USA); National Science Foundation (USA);
Keywords
Optical Character Recognition; Biodiversity Data; Global Biodiversity Information Facility; Data Preservation; Open Archival Information System;
DOI
10.1186/1471-2105-12-S15-S5
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Today, an unprecedented volume of primary biodiversity data are being generated worldwide, yet significant amounts of these data have been and will continue to be lost after the conclusion of the projects tasked with collecting them. To get the most value out of these data it is imperative to seek a solution whereby these data are rescued, archived and made available to the biodiversity community. To this end, the biodiversity informatics community requires investment in processes and infrastructure to mitigate data loss and provide solutions for long-term hosting and sharing of biodiversity data.
Discussion: We review the current state of biodiversity data hosting and investigate the technological and sociological barriers to proper data management. We further explore the rescuing and re-hosting of legacy data, the state of existing toolsets and propose a future direction for the development of new discovery tools. We also explore the role of data standards and licensing in the context of data hosting and preservation. We provide five recommendations for the biodiversity community that will foster better data preservation and access: (1) encourage the community's use of data standards, (2) promote the public domain licensing of data, (3) establish a community of those involved in data hosting and archival, (4) establish hosting centers for biodiversity data, and (5) develop tools for data discovery.
Conclusion: The community's adoption of standards and development of tools to enable data discovery is essential to sustainable data preservation. Furthermore, the increased adoption of open content licensing, the establishment of data hosting infrastructure and the creation of a data hosting and archiving community are all necessary steps towards the community ensuring that data archival policies become standardized.
Pages: 14
Related papers
50 in total
  • [1] Data hosting infrastructure for primary biodiversity data
    Anthony Goddard
    Nathan Wilson
    Phil Cryer
    Grant Yamashita
    [J]. BMC Bioinformatics, 12
  • [2] Hiding data and code security for application hosting infrastructure
    Lin, P
    Candan, KS
    Bazzi, R
    Liu, ZC
    [J]. INTELLIGENCE AND SECURITY INFORMATICS, PROCEEDINGS, 2003, 2665 : 388 - 388
  • [3] Indicators for the Data Usage Index (DUI): an incentive for publishing primary biodiversity data through global information infrastructure
    Peter Ingwersen
    Vishwas Chavan
    [J]. BMC Bioinformatics, 12
  • [4] Indicators for the Data Usage Index (DUI): an incentive for publishing primary biodiversity data through global information infrastructure
    Ingwersen, Peter
    Chavan, Vishwas
    [J]. BMC BIOINFORMATICS, 2011, 12 : S3
  • [5] A Data Mining Framework for Primary Biodiversity Data Analysis
    Fontes, Suelane Garcia
    Stanzani, Silvio Luiz
    Pizzigatti Correa, Pedro Luiz
    [J]. NEW CONTRIBUTIONS IN INFORMATION SYSTEMS AND TECHNOLOGIES, VOL 1, PT 1, 2015, 353 : 813 - 821
  • [6] Biodiversity informatics: managing and applying primary biodiversity data
    Soberón, J
    Peterson, AT
    [J]. PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY B-BIOLOGICAL SCIENCES, 2004, 359 (1444) : 689 - 698
  • [7] Supporting biodiversity studies with the EUBrazilOpenBio Hybrid Data Infrastructure
    Amaral, Rafael
    Badia, Rosa M.
    Blanquer, Ignacio
    Braga-Neto, Ricardo
    Candela, Leonardo
    Castelli, Donatella
    Flann, Christina
    De Giovanni, Renato
    Gray, William A.
    Jones, Andrew
    Lezzi, Daniele
    Pagano, Pasquale
    Perez-Canhos, Vanderlei
    Quevedo, Francisco
    Rafanell, Roger
    Rebello, Vinod
    Sousa-Baena, Mariane S.
    Torres, Erik
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2015, 27 (02) : 376 - 394
  • [8] The ABCD of primary biodiversity data access
    Holetschek, J.
    Droege, G.
    Guentsch, A.
    Berendsohn, W. G.
    [J]. PLANT BIOSYSTEMS, 2012, 146 (04) : 771 - 779
  • [9] A framework for publishing primary biodiversity data
    Dave Roberts
    Tom Moritz
    [J]. BMC Bioinformatics, 12
  • [10] PRIMARY BIODIVERSITY DATA RECORDS IN THE PYRENEES
    Arino, Arturo H.
    Otegui, Javier
    Villarroya, Ana
    Perez de Zabalza, Anabel
    [J]. ENVIRONMENTAL ENGINEERING AND MANAGEMENT JOURNAL, 2012, 11 (06) : 1059 - 1075