Named Entities as a Metadata Resource for Indexing and Searching Information

Cited by: 1
Authors
Izo, Flavio [1, 2]
Oliveira, Elias [1]
Badue, Claudine [1]
Affiliations
[1] Univ Fed Espirito Santo, Programa Posgrad Informat, Vitoria, ES, Brazil
[2] Inst Fed Espirito Santo, Cachoeiro De Itapemirim, Brazil
Keywords
Artificial intelligence; Indexing; Named Entity Recognition; Natural Language Processing; Search engine;
DOI
10.1007/978-3-030-96308-8_78
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Named Entity Recognition (NER) is an essential task in Natural Language Processing (NLP). NER makes it possible to create associations in a text that identify real-world entities. Data indexing is also a vital resource, as it makes it easier to find texts in a set of documents. When we analyze a search engine, we aim to ease the user's search process. Indexing recognized entities could help the search engine retrieve data with a high semantic index and therefore return more accurate results. This study investigates the automatic transformation of annotated entities into indexes for a search engine. Entities were recognized with the hybrid model CRF+LG. Search engines usually work by locating keywords (tokens). This research instead aimed to use semantic search, which improves the quality of the results by interpreting the user's intention through enriching metadata in addition to the keyword. We performed ten experiments using P@{5, 10, and 20}, and the search engine with a high semantic index achieved an accuracy of 100%, correctly returning all results. The search engine without NER confused results mainly in the person and organization categories.
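The P@{5, 10, 20} measure mentioned in the abstract is precision at a cutoff k: the fraction of the top k returned results that are relevant. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `precision_at_k` and the document identifiers are hypothetical and chosen only for illustration.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Return the fraction of the top-k ranked results that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    # Count how many of the top-k results appear in the relevant set.
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    # Divide by k (not len(top_k)), so short result lists are penalized.
    return hits / k


# Hypothetical ranking returned by a search engine for one query.
ranked = ["d3", "d7", "d1", "d9", "d4"]
# Hypothetical set of documents judged relevant for that query.
relevant = {"d3", "d1", "d4", "d8"}

print(precision_at_k(ranked, relevant, 5))
```

Under this definition, a value of 1.0 at k = 5, 10, and 20 means every result in the top 5, 10, and 20 positions was judged relevant.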
Pages: 838-848
Page count: 11