Can we use Linked Data Semantic Annotators for the Extraction of Domain-Relevant Expressions?

Cited by: 0
Authors
Gagnon, Michel [1]
Zouaq, Amal [2]
Jean-Louis, Ludovic [1]
Affiliations
[1] Ecole Polytech Montreal, Montreal, PQ, Canada
[2] Royal Mil Coll Canada, Kingston, ON, Canada
Keywords
Semantic annotation; topic extraction; evaluation
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Semantic annotation is the process of identifying expressions in texts and linking them to some semantic structure. In particular, linked data-based semantic annotators are now becoming the new Holy Grail for meaning extraction from unstructured documents. This paper presents an evaluation of the main linked data-based annotators available, with a focus on domain topics and named entities. Specifically, we compare the ability of each tool to annotate relevant domain expressions in text. The paper also proposes a combination of annotators through voting methods and machine learning. Our results show that some linked data-based annotators, especially Alchemy, can be considered a useful resource for topic extraction. They also show that a substantial increase in recall can be achieved by combining the annotators with a weighted voting scheme. Finally, an interesting result is that removing Alchemy from the combination, or combining only the more precise annotators, yields a significant increase in precision at the cost of lower recall.
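The weighted voting combination mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the weights, the threshold, and the scoring rule (sum the weights of every annotator that proposes an expression, keep expressions at or above a threshold) are all assumptions chosen for illustration, and the record does not say how the paper sets its weights.

```python
from collections import defaultdict


def weighted_vote(annotations, weights, threshold):
    """Combine annotator outputs by weighted voting (illustrative sketch).

    annotations: dict mapping a hypothetical annotator name to the set of
                 expressions it extracted from a text.
    weights:     dict mapping annotator name to a vote weight (assumed values,
                 not taken from the paper).
    threshold:   minimum summed weight for an expression to be kept.
    """
    scores = defaultdict(float)
    for name, expressions in annotations.items():
        for expr in expressions:
            # Each annotator that proposes the expression adds its weight.
            scores[expr] += weights.get(name, 0.0)
    return {expr for expr, score in scores.items() if score >= threshold}


def precision_recall(predicted, gold):
    """Return (precision, recall) of a predicted set against a gold set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

Under these assumptions, lowering the threshold lets more expressions through (favoring recall), while raising it or dropping a low-precision annotator from `annotations` keeps only expressions that several annotators agree on (favoring precision). This mirrors the trade-off the abstract reports.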
Pages: 1239-1246
Page count: 8