Semantic Relation Extraction from Cultural Heritage Archives

Authors
Buranasing, Watchira [1, 2]
Lilakiataskun, Woraphon [1]
Affiliations
[1] Mahanakorn Univ Technol, Fac Informat Sci & Technol, Bangkok 10530, Thailand
[2] Natl Elect & Comp Technol Ctr, Pathum Thani 12120, Thailand
Source
JOURNAL OF WEB ENGINEERING | 2022 / Vol. 21 / Issue 04
Keywords
Digital archive; relation extraction; cultural archive; word vector representation; information extraction
DOI
10.13052/jwe1540-9589.2145
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Cultural heritage organizations are increasingly adopting digital preservation technologies. The resulting cultural heritage data is often disseminated as digital text through channels such as Wikipedia and cultural heritage archives. Acquiring knowledge from this data depends on extraction techniques. However, digital text is often ambiguous and, in languages such as Thai, has complex grammatical structures, which makes it challenging to extract information with a high level of accuracy. We therefore propose a method for improving the performance of extraction techniques based on word features, multiple instance learning, and unseen word mapping. Word features improve the quality of a word's representation: part-of-speech (POS) tags and word position are concatenated with the word to pin down its meaning, and the result is converted into a vector. Multiple instance learning addresses cases where the words do not fully express the meaning of the triple. We also cluster words to find the predicate, removing irrelevant words between the subject and the object. Words that never appeared in training are handled by unseen word mapping, which combines sub-word and nearest-neighbor word mapping. We conducted several experiments on a cultural heritage knowledge graph to evaluate the proposed method. The results show that it outperforms existing models used in relation extraction systems, with precision, recall, and F1 scores of 0.89, 0.88, and 0.89, respectively. For unseen word prediction, its precision, recall, and F1 scores were 0.81, 0.87, and 0.84, respectively.
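The abstract does not specify how unseen word mapping is implemented, so the following is only a minimal sketch of the general idea, assuming that sub-words are character n-grams and that the nearest neighbor is chosen by cosine similarity over n-gram counts. The function names (`char_ngrams`, `cosine`, `map_unseen`), the n-gram sizes, and the toy vocabulary are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def char_ngrams(word, n_min=2, n_max=3):
    """Count character n-grams of a word, padded with boundary markers.

    Assumption: sub-words are character n-grams; the paper may use a
    different sub-word scheme.
    """
    padded = f"<{word}>"
    return Counter(
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    )


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def map_unseen(word, vocabulary):
    """Map a word to itself if known, else to its nearest known neighbor.

    Assumes `vocabulary` is a non-empty collection of trained words.
    """
    if word in vocabulary:
        return word
    target = char_ngrams(word)
    return max(vocabulary, key=lambda v: cosine(target, char_ngrams(v)))


# Hypothetical training vocabulary, for illustration only.
vocab = ["temple", "monument", "festival", "sculpture"]
mapped = map_unseen("temples", vocab)
print(mapped)
```

In this sketch an unseen word reuses the representation of the trained word whose sub-word profile is closest to its own. A real system would more likely compare learned sub-word embeddings than raw n-gram counts.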
Pages: 1081-1102
Number of pages: 22