Corpus-driven Annotation Enrichment

Cited by: 0
Authors
Kuhr, Felix [1]
Witten, Bjarne [1]
Moeller, Ralf [1]
Affiliations
[1] Univ Lubeck, Inst Informat Syst, Ratzeburgerallee 160, D-23562 Lubeck, Germany
DOI
10.1109/ICSC.2019.00031
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A reference library can be described as a corpus with an individual composition of documents, containing related research work, documents by favorite authors, or the proceedings of a conference. Enriching documents with meaningful annotations benefits the performance of applications such as semantic search, content aggregation, automated relationship discovery, query answering, and information retrieval. Available (semi-)automatic annotation tools ignore the individual composition of documents in corpora by annotating documents with generic named-entity-related data. In this paper, we present an unsupervised corpus-driven annotation enrichment approach that considers the composition of documents and uses an EM-like algorithm to enrich weakly annotated documents with meaningful annotations of related documents from the same corpus.
Pages: 138-141
Page count: 4