Named Entities as a Metadata Resource for Indexing and Searching Information

Cited by: 1
Authors
Izo, Flavio [1, 2]
Oliveira, Elias [1]
Badue, Claudine [1]
Affiliations
[1] Univ Fed Espirito Santo, Programa Posgrad Informat, Vitoria, ES, Brazil
[2] Inst Fed Espirito Santo, Cachoeiro De Itapemirim, Brazil
Keywords
Artificial intelligence; Indexing; Named Entity Recognition; Natural Language Processing; Search engine
DOI
10.1007/978-3-030-96308-8_78
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Named Entity Recognition (NER) is an essential task in Natural Language Processing (NLP). NER makes it possible to associate spans of text with the real-world entities they refer to. Data indexing is also a vital resource, as it makes it easier to find texts in a set of documents. When we analyze a search engine, our goal is to ease the user's search process. Indexing recognized entities could help the search engine find data with a high semantic index and therefore return more accurate results. This study investigates the automatic transformation of annotated entities into indexes for a search engine. Entities were recognized with the hybrid model CRF+LG. Search engines usually work by locating keywords (tokens). This research instead uses semantic search, which improves the quality of the results by capturing the user's intention through enriching metadata in addition to the keyword. We performed ten experiments using P@5, P@10, and P@20, and the search engine with a high semantic index achieved an accuracy of 100%, correctly returning all results. The search engine without NER was confused mainly when producing results for the person and organization categories.
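The P@k measure mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard definition of precision at k (the fraction of the top k returned documents that are relevant); the function name `precision_at_k`, the document identifiers, and the relevance judgments are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Set


def precision_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    # Divide by k rather than len(top_k), so a short result list is penalized.
    return hits / k


# Hypothetical ranking and relevance judgments, for illustration only.
ranked = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9", "d10"]
relevant = {"d1", "d2", "d3", "d5", "d8"}

# Evaluate at the same cutoffs named in the abstract: 5, 10, and 20.
scores = {k: precision_at_k(ranked, relevant, k) for k in (5, 10, 20)}
print(scores)
```

Under this definition, a score of 1.0 at a given cutoff means every one of the top k results was judged relevant.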
Pages: 838-848
Page count: 11