Semantic Relation Extraction from Cultural Heritage Archives

Authors
Buranasing, Watchira [1,2]
Lilakiataskun, Woraphon [1]
Affiliations
[1] Mahanakorn Univ Technol, Fac Informat Sci & Technol, Bangkok 10530, Thailand
[2] Natl Elect & Comp Technol Ctr, Pathum Thani 12120, Thailand
Source
JOURNAL OF WEB ENGINEERING | 2022 / Vol. 21 / Issue 04
Keywords
Digital archive; relation extraction; cultural archive; word vector representation; information extraction
DOI
10.13052/jwe1540-9589.2145
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Cultural heritage organizations are increasingly adopting digital preservation technologies. Cultural heritage data is often disseminated as digital text through channels such as Wikipedia and cultural heritage archives. Acquiring knowledge from this data makes information extraction an important step. However, digital text is often ambiguous and, in languages such as Thai, has complex grammatical structures, which makes it challenging to extract information with high accuracy. We therefore propose a method for improving the performance of extraction techniques based on word features, multiple instance learning, and unseen word mapping. Word features improve the quality of the word representation: each word is combined with its part of speech (POS) and its position to pin down its meaning, and the result is converted into a vector. Multiple instance learning addresses cases where the words do not fully express the meaning of the triple. We also cluster words to find the predicate by removing irrelevant words between the subject and the object. Words that were never seen during training are handled by unseen word mapping, which uses sub-word and nearest-neighbor word mapping. We conducted several experiments on a cultural heritage knowledge graph to show the efficacy of the proposed method. The results show that the proposed technique outperforms existing models used in relation extraction systems, with precision, recall, and F1 score of 0.89, 0.88, and 0.89, respectively. It also performed well on unseen word prediction, with precision, recall, and F1 score of 0.81, 0.87, and 0.84, respectively.
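The abstract does not give implementation details, so the following is only a minimal sketch of two of the ideas it names: concatenating a word vector with POS and position features, and mapping an unseen word to its nearest known word through sub-word units. The function names, the character trigram Jaccard similarity, the one-hot POS encoding, and the toy vectors are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List, Set


def char_ngrams(word: str, n: int = 3) -> Set[str]:
    """Character n-grams of a word padded with boundary markers (assumed sub-word unit)."""
    padded = f"<{word}>"
    if len(padded) <= n:
        return {padded}
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def nearest_known_word(word: str, vocab: List[str], n: int = 3) -> str:
    """Map an unseen word to the vocabulary word with the highest n-gram Jaccard overlap."""
    target = char_ngrams(word, n)

    def jaccard(candidate: str) -> float:
        grams = char_ngrams(candidate, n)
        return len(target & grams) / len(target | grams)

    return max(vocab, key=jaccard)


def word_features(
    word: str,
    pos: str,
    position: int,
    vectors: Dict[str, List[float]],
    pos_tags: List[str],
    max_len: int,
) -> List[float]:
    """Concatenate word vector, one-hot POS tag, and normalized position (hypothetical layout)."""
    # Fall back to the nearest known word when the word has no trained vector.
    known = word if word in vectors else nearest_known_word(word, list(vectors))
    pos_one_hot = [1.0 if tag == pos else 0.0 for tag in pos_tags]
    return vectors[known] + pos_one_hot + [position / max_len]


# Toy data, invented for this example only.
toy_vectors = {"temple": [0.1, 0.2], "king": [0.3, 0.4], "built": [0.5, 0.6]}
toy_pos_tags = ["NOUN", "VERB"]

# "temples" has no vector, so it borrows the vector of "temple".
features = word_features("temples", "NOUN", 2, toy_vectors, toy_pos_tags, max_len=10)
print(features)
```

A trained sub-word embedding model would normally replace the Jaccard lookup. The set-based version is used here only to keep the sketch free of dependencies.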
Pages: 1081-1102
Page count: 22