Named Entity Recognition for Short Text Messages

Cited by: 16
Authors
Ek, Tobias [1]
Kirkegaard, Camilla [1]
Jonsson, Hakan [2]
Nugues, Pierre [1]
Affiliations
[1] Lund Univ, Dept Comp Sci, Box 118, S-22100 Lund, Sweden
[2] Sony Ericsson, S-22188 Lund, Sweden
Keywords
Named entity recognition; Short text messages; SMS; Information extraction; Ensemble systems
DOI
10.1016/j.sbspro.2011.10.596
Chinese Library Classification (CLC) number
H0 [Linguistics]
Discipline classification codes
030303; 0501; 050102
Abstract
This paper describes a named entity recognition (NER) system for short text messages (SMS) running on a mobile platform. Most NER systems deal with text that is structured, formal, well written, with a good grammatical structure, and few spelling errors. SMS text messages lack these qualities and have instead a short-handed and mixed language studded with emoticons, which makes NER a challenge on this kind of material. We implemented a system that recognizes named entities from SMSes written in Swedish and that runs on an Android cellular telephone. The entities extracted are locations, names, dates, times, and telephone numbers with the idea that extraction of these entities could be utilized by other applications running on the telephone. We started from a regular expression implementation that we complemented with classifiers using logistic regression. We optimized the recognition so that the incoming text messages could be processed on the telephone with a fast response time. We reached an F-score of 86 for strict matches and 89 for partial matches. (C) 2011 Published by Elsevier Ltd. Selection and/or peer-review under responsibility of PACLING Organizing Committee.
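As a rough illustration of the regular-expression stage and of the difference between strict and partial matching mentioned in the abstract, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the patterns `TIME_RE` and `PHONE_RE`, the helper names `extract_entities` and `f_score`, and the definition of a partial match as "same label, overlapping span" are not taken from the paper, whose actual rules, classifiers, and scoring scheme are not given here.

```python
import re

# Hypothetical patterns for two of the entity types named in the abstract.
# They are illustrative only and are NOT the rules used in the paper.
TIME_RE = re.compile(r"\b(?:[01]?\d|2[0-3])[:.][0-5]\d\b")
PHONE_RE = re.compile(r"(?<!\d)(?:\+46|0)\d(?:[ -]?\d){6,9}(?!\d)")


def extract_entities(message):
    """Return (label, start, end, text) tuples sorted by start offset."""
    found = []
    for label, pattern in (("TIME", TIME_RE), ("PHONE", PHONE_RE)):
        for m in pattern.finditer(message):
            found.append((label, m.start(), m.end(), m.group()))
    return sorted(found, key=lambda e: e[1])


def f_score(gold, predicted, partial=False):
    """F1 over (label, start, end, ...) spans.

    Strict: label and both offsets must be identical.
    Partial (assumed definition): same label and any character overlap.
    """
    def matches(g, p):
        if g[0] != p[0]:
            return False
        if partial:
            return g[1] < p[2] and p[1] < g[2]
        return g[1:3] == p[1:3]

    hit_pred = sum(any(matches(g, p) for g in gold) for p in predicted)
    hit_gold = sum(any(matches(g, p) for p in predicted) for g in gold)
    precision = hit_pred / len(predicted) if predicted else 0.0
    recall = hit_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    for entity in extract_entities("Vi ses 18.30, ring 070-123 45 67 :)"):
        print(entity)
```

A predicted span that covers only part of a telephone number counts as a miss under strict scoring and as a hit under the partial definition assumed here, which is one way a partial score can come out higher than a strict one.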
Pages: 178-187
Page count: 10