Lightweight Named Entity Extraction for Korean Short Message Service Text

Cited by: 6
Authors
Seon, Choong-Nyoung [1]
Yoo, JinHwan [1]
Kim, Harksoo [2]
Kim, Ji-Hwan [1]
Seo, Jungyun [3, 4]
Affiliations
[1] Sogang Univ, Dept Comp Sci & Engn, Seoul 121742, South Korea
[2] Kangwon Natl Univ, Dept Comp & Commun Engn, Chuncheon Si 200701, Gangwon Do, South Korea
[3] Sogang Univ, Dept Comp Sci, Seoul 121742, South Korea
[4] Sogang Univ, Interdisciplinary Program Integrated Biotechnol, Seoul 121742, South Korea
Keywords
Named entity (NE) extraction; machine learning (ML); rule-based; lightweight
DOI
10.3837/tiis.2011.03.006
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
In this paper, we propose a hybrid method that combines a machine learning (ML) algorithm with a rule-based algorithm to implement a lightweight named entity (NE) extraction system for Korean SMS text. NE extraction from Korean SMS text is challenging because of the resource limitations of a mobile phone, corruption in the input text, the need for extension to include personal information stored on the phone, and the sparsity of training data. The proposed hybrid method retains the advantages of both statistical ML and rule-based algorithms, and provides fully automated procedures for combining ML approaches with their correction rules using a threshold-based soft decision function. The method is applied to Korean SMS texts to extract person names and location names, which are key information in a personal appointment management system. The proposed system achieved an F-measure of 80.53% in this domain, superior to that of the conventional ML approaches.
Pages: 560-574
Page count: 15