Datasheets for Digital Cultural Heritage Datasets

Authors
Alkemade, Henk [1, 2]
Claeyssens, Steven [3]
Colavizza, Giovanni [4]
Freire, Nuno [5, 6]
Lehmann, Joerg [7]
Neudecker, Clemens [1, 7]
Osti, Giulia [8]
Van Strien, Daniel [9]
Affiliations
[1] Europeana Network Assoc, EuropeanaTech Community, The Hague, Netherlands
[2] ARARE, Dublin, Ireland
[3] Natl Lib Netherlands, KB, The Hague, Netherlands
[4] Univ Bologna, Dept Class & Italian Philol, Bologna, Italy
[5] NOVA Univ Lisbon, Sch Social Sci & Humanities, Lisbon, Portugal
[6] Europeana Fdn, The Hague, Netherlands
[7] Staatsbibliothek Berlin (Berlin State Lib), Berlin, Germany
[8] Univ Coll Dublin, Sch Informat & Commun Studies, Dublin, Ireland
[9] Hugging Face, Glasgow City, Scotland
Keywords
datasheets; datasets; digital cultural heritage; model cards; machine learning; GLAM institutions
DOI
10.5334/johd.124
Chinese Library Classification (CLC) number
C [Social Sciences, General]
Discipline classification code
03; 0303
Abstract
Sparked by issues of quality and lack of proper documentation for datasets, the machine learning community has begun developing standardised processes for establishing datasheets for machine learning datasets, with the intent to provide context and information on provenance, purposes, composition, the collection process, recommended uses or societal biases reflected in training datasets. This approach fits well with practices and procedures established in GLAM institutions, such as establishing collections' descriptions. However, digital cultural heritage datasets are marked by specific characteristics. They are often the product of multiple layers of selection; they may have been created for different purposes than establishing a statistical sample according to a specific research question; they change over time and are heterogeneous. Punctuated by a series of recommendations to create datasheets for digital cultural heritage, the paper addresses the scope and characteristics of digital cultural heritage datasets; possible metrics and measures; lessons from concepts similar to datasheets and/or established workflows in the cultural heritage sector. This paper includes a proposal for a datasheet template that has been adapted for use in cultural heritage institutions, and which proposes to incorporate information on the motivation and selection criteria, digitisation pipeline, data provenance, the use of linked open data, and version information.
Pages: 11