Separating Named Entities

Cited by: 0
Authors
Ulipova, Barbora [1]
Grac, Marek [1]
Affiliations
[1] Masaryk Univ, Fac Arts, Computat Linguist Ctr, CS-60177 Brno, Czech Republic
Keywords
text corpus; mutual information; named entities
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
This paper analyzes long sequences of mostly capitalized words that look like a single named entity but in fact consist of several named entities. An example is hokejista (hockey player) New York Rangers Jaromir Jagr. Unless the sequence is split correctly, the whole capitalized sequence is wrongly taken to be the name of the hockey player. To determine how such a sequence should be split into the correct named entities, we tested several methods based on the frequencies of the words in the sequence and of their n-grams. The DIFF-2 method proposed in this article obtained much better results than MI-score or logDice.
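The two baseline measures named in the abstract can be sketched from their commonly used definitions: MI-score as log2(f_xy * N / (f_x * f_y)) and logDice as 14 + log2(2 * f_xy / (f_x + f_y)). The sketch below is only an illustration under stated assumptions. The splitting heuristic (cut at the adjacent pair with the weakest association) and the helper names `mi_score`, `log_dice`, and `split_at_weakest` are hypothetical, the frequency counts are invented toy values rather than corpus data, and DIFF-2 is not shown because this record does not define it.

```python
import math

def mi_score(f_xy, f_x, f_y, n):
    """MI-score of a word pair; n is the corpus size in tokens."""
    if f_xy == 0:
        return float("-inf")  # pair never observed together
    return math.log2(f_xy * n / (f_x * f_y))

def log_dice(f_xy, f_x, f_y):
    """logDice of a word pair; independent of corpus size."""
    if f_xy == 0:
        return float("-inf")  # pair never observed together
    return 14 + math.log2(2 * f_xy / (f_x + f_y))

def split_at_weakest(words, unigram, bigram):
    """Hypothetical heuristic: cut between the adjacent words
    whose logDice association is lowest."""
    scores = [
        log_dice(bigram.get((a, b), 0), unigram[a], unigram[b])
        for a, b in zip(words, words[1:])
    ]
    i = min(range(len(scores)), key=scores.__getitem__)
    return words[: i + 1], words[i + 1 :]

# Invented toy counts, for illustration only (not from the paper).
UNIGRAM = {"New": 1000, "York": 800, "Rangers": 50, "Jaromir": 40, "Jagr": 45}
BIGRAM = {
    ("New", "York"): 700,
    ("York", "Rangers"): 30,
    ("Rangers", "Jaromir"): 2,
    ("Jaromir", "Jagr"): 38,
}

left, right = split_at_weakest(
    ["New", "York", "Rangers", "Jaromir", "Jagr"], UNIGRAM, BIGRAM
)
print(left, right)
```

With these toy counts the weakest logDice link is between Rangers and Jaromir, so the sequence is cut into the team name and the player name. Real corpus frequencies may behave differently, which is the situation the compared methods are meant to address.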
Pages: 91-96
Number of pages: 6