Semi-supervised Approach Based on Co-occurrence Coefficient for Named Entity Recognition on Twitter

Authors
Van Cuong Tran [1]
Hwang, Dosam [1]
Jung, Jason J. [2]
Affiliations
[1] Yeungnam Univ, Dept Comp Engn, Gyongsan, South Korea
[2] Chung Ang Univ, Dept Comp Engn, Seoul, South Korea
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Data in Social Network Services (SNS) are usually short, contain insufficient information, and are often affected by noise. As a result, popular Named Entity Recognition (NER) methods can produce wrong results on such data even if they perform well on well-formatted documents. Most NER methods are based on supervised learning techniques, which often require a large training dataset to train a good classifier. Conditional Random Fields (CRF) are one example of a supervised learning method: a statistical modeling method that predicts labels for sequences of input samples. The weak point of these methods is that they perform well only on well-formatted sentences. However, proper sentences are not used frequently in SNS; for example, many tweets on Twitter are combinations of independent terms that implicitly belong to the context of a certain discussion topic. In this paper, we propose a method to extract named entities from social data using semi-supervised learning. It is an extension of the CRF method that adapts to this new challenge by segmenting the data according to its context rather than considering the entire dataset. In experiments, the method is applied to a dataset collected from Twitter, which includes 8,624 tweets for training, including 1,915 labeled tweets, and 1,690 tweets for testing. Our system produces a promising result, with the F score of the classification at approximately 83.9%.
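The abstract reports an F score but does not say how it is computed. The following is a minimal sketch of one common convention for NER evaluation, not the authors' evaluation code. It assumes BIO-style labels and exact-match scoring at the entity level; the label scheme, the function names `bio_to_spans` and `f_score`, and the toy data are all illustrative assumptions.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type); a hypothetical representation
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence (assumed label scheme)."""
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # trailing "O" closes any open span
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def f_score(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) using exact-match entity spans."""
    tp = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        gold_spans, pred_spans = bio_to_spans(g), bio_to_spans(p)
        tp += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example: the prediction finds the person but misses the location.
gold = [["B-PER", "I-PER", "O", "B-LOC"]]
pred = [["B-PER", "I-PER", "O", "O"]]
precision, recall, f1 = f_score(gold, pred)
```

Under this convention an entity counts as correct only if both its boundaries and its type match, so token-level scoring of the same predictions would generally give a different figure.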
Pages: 141-146
Page count: 6