A simple and fast method for Named Entity context extraction from patents

Cited by: 15
Authors
Puccetti, Giovanni [1]
Chiarello, Filippo [2]
Fantoni, Gualtiero [3]
Affiliations
[1] Scuola Normale Super Pisa, Piazza Cavalieri 7, I-56126 Pisa, Italy
[2] Dept Energy Syst Terr & Construct Engn, Largo Lucio Lazzarino 2, I-56122 Pisa, Italy
[3] Dept Civil & Ind Engn, Largo Lucio Lazzarino 2, I-56122 Pisa, Italy
Keywords
Natural Language Processing; Information retrieval; Patents; FRAMEWORK
DOI
10.1016/j.eswa.2021.115570
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Extracting relevant technical information from patents or technical literature is as valuable as it is challenging. The task involves extracting highly relevant information from a corpus of documents with a particular structure and a mix of technical and legal jargon. Patents are the widest free source of technical information in which homogeneous entities can be found. From a technical perspective, the approaches rely on Named Entity Recognition (NER) and use Machine Learning techniques for Natural Language Processing (NLP). However, because of the large amount of data, the complexity of the lexicon, the peculiarity of the structure, and the scarcity of examples with which to train the machine learning system, new approaches should be studied. NER methods are improving their performance in many contexts, but a gap still exists when dealing with technical documentation. The aim of this work is to automatically create training sets for NER systems by exploiting the nature and structure of patents, an open and massive source of technical documentation. In particular, we focus on collecting the contexts in which users of the invention appear within patents. We then measure to what extent we achieve our goal and discuss how generalizable our method is to other entities and documents.
Pages: 9
Related papers
50 in total
  • [31] Bootstrapping Named Entity Extraction for the Creation of Mobile Services
    Polifroni, Joseph
    Kiss, Imre
    Adler, Mark
    LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2010, : 1515 - 1520
  • [32] Named Entity Extraction for Knowledge Graphs: A Literature Overview
    Al-Moslmi, Tareq
    Ocana, Marc Gallofre
    Opdahl, Andreas L.
    Veres, Csaba
    IEEE ACCESS, 2020, 8 : 32862 - 32881
  • [33] END-TO-END NAMED ENTITY AND SEMANTIC CONCEPT EXTRACTION FROM SPEECH
    Ghannay, S.
    Caubriere, A.
    Esteve, Y.
    Camelin, N.
    Simonnet, E.
    Laurent, A.
    Morin, E.
    2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 692 - 699
  • [34] An Approach to Named Entity Extraction from Historical Documents in Traditional Mongolian Script
    Batjargal, Biligsaikhan
    Khaltarkhuu, Garmaabazar
    Kimura, Fuminori
    Maeda, Akira
    2014 IEEE/ACM JOINT CONFERENCE ON DIGITAL LIBRARIES (JCDL), 2014, : 489 - 490
  • [35] Active Learning Technique for Biomedical Named Entity Extraction
    Saha, Sriparna
    Ekbal, Asif
    Verma, Mridula
    Sikdar, Utpal
    Poesio, Massimo
    PROCEEDINGS OF THE 2012 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI'12), 2012, : 835 - 841
  • [36] Information Extraction: Evaluating Named Entity Recognition from Classical Malay Documents
    Sazali, Siti Syakirah
    Rahman, Nurazzah Abdul
    Abu Bakar, Zainab
    2016 THIRD INTERNATIONAL CONFERENCE ON INFORMATION RETRIEVAL AND KNOWLEDGE MANAGEMENT (CAMP), 2016, : 48 - 53
  • [37] Ontology Extraction from Software Requirements Using Named-Entity Recognition
    Kocerka, Jerzy
    Krzeslak, Michal
    Galuszka, Adam
    ADVANCES IN SCIENCE AND TECHNOLOGY-RESEARCH JOURNAL, 2022, 16 (03) : 207 - 212
  • [38] Japanese Named Entity extraction with redundant morphological analysis
    Asahara, M
    Matsumoto, Y
    HLT-NAACL 2003: HUMAN LANGUAGE TECHNOLOGY CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE MAIN CONFERENCE, 2003, : 8 - 15
  • [39] Named Entity Relation Extraction Based on Multiple Features
    Li, Yeqing
    2015 IEEE 29TH INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION NETWORKING AND APPLICATIONS WORKSHOPS WAINA 2015, 2015, : 213 - 216
  • [40] Learning pattern rules for Chinese named entity extraction
    Chua, TS
    Liu, JM
    EIGHTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-02)/FOURTEENTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE (IAAI-02), PROCEEDINGS, 2002, : 411 - 418