BiLSTM-CRF Manipuri NER with Character-Level Word Representation

Cited by: 6
Authors
Jimmy, Laishram [1]
Nongmeikappam, Kishorjit [2]
Naskar, Sudip Kumar [3]
Affiliations
[1] Manipur Tech Univ, Imphal, Manipur, India
[2] Indian Inst Informat Technol Manipur, Imphal, Manipur, India
[3] Jadavpur Univ, Kolkata, W Bengal, India
Keywords
Manipuri; Named entity recognition and classification; LSTM; CRF; Embeddings; Deep neural networks; Recurrent neural networks; NAMED ENTITY RECOGNITION; MODEL;
DOI
10.1007/s13369-022-06933-z
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Subject Classification Codes
07; 0710; 09
Abstract
Named Entity Recognition and Classification (NER) serves as a foundation for many natural language processing tasks such as question answering, text summarization, news/document clustering and machine translation. Early Manipuri NER systems are based on machine learning approaches and employ handcrafted morphological features and domain-specific rules. Domain-specific rules for Manipuri NER are hard to extract because the language is highly agglutinative and inflectional, and it falls in the category of low-resource languages. In recent years, deep learning, empowered by continuous vector representations and semantic composition through non-linear processing, has been employed in various NER tasks, yielding state-of-the-art accuracy. In this paper, we propose a Manipuri NER model using a Bidirectional Long Short-Term Memory (BiLSTM) deep neural network in unison with an embedding technique. The embedding technique is a BiLSTM character-level word representation used in conjunction with word embeddings, which acts as the feature input for the BiLSTM NER model. The proposed model also employs a Conditional Random Field (CRF) classifier to capture the dependencies among output NER tags. Various gradient descent (GD) optimizers for the neural model were experimented with to establish an efficient GD optimizer for accurate NER. The NER model with the RMSprop GD optimizer achieved an F-score of approximately 98.19% at learning rate η = 0.001 and decay constant ρ = 0.9. Further, an intrinsic evaluation of the word embedding shows that the proposed embedding technique as a feature can capture the semantic and syntactic rules of the language, with 88.14% average clustering accuracy across all NE classes.
Pages: 1715-1734
Number of pages: 20
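
The abstract above outlines the architecture: a character-level BiLSTM produces a per-word representation that is concatenated with a word embedding, a word-level BiLSTM encodes the sentence, a CRF layer models dependencies among the output tags, and training uses RMSprop with learning rate 0.001 and decay constant 0.9. The following is a minimal, hypothetical PyTorch sketch of such an encoder, not the authors' implementation: layer sizes, vocabulary sizes, and the number of tags are illustrative assumptions, and the CRF layer (e.g. an off-the-shelf package such as pytorch-crf) is only indicated in a comment.

```python
# Hypothetical sketch of a character-level + word-level BiLSTM encoder for NER.
# All dimensions and vocabulary sizes below are illustrative, not from the paper.
import torch
import torch.nn as nn

class CharWordBiLSTM(nn.Module):
    def __init__(self, n_chars, n_words, n_tags,
                 char_dim=30, char_hidden=25, word_dim=100, word_hidden=100):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        # Character-level BiLSTM: encodes each word's character sequence
        # into a fixed-size character-level word representation.
        self.char_lstm = nn.LSTM(char_dim, char_hidden,
                                 bidirectional=True, batch_first=True)
        self.word_emb = nn.Embedding(n_words, word_dim, padding_idx=0)
        # Word-level BiLSTM over [word embedding ; character-level representation].
        self.word_lstm = nn.LSTM(word_dim + 2 * char_hidden, word_hidden,
                                 bidirectional=True, batch_first=True)
        # Per-token emission scores; a CRF layer (e.g. pytorch-crf) would sit on
        # top of these scores to capture dependencies among output NER tags.
        self.emit = nn.Linear(2 * word_hidden, n_tags)

    def forward(self, word_ids, char_ids):
        # word_ids: (batch, seq_len); char_ids: (batch, seq_len, max_word_len)
        b, s, c = char_ids.shape
        char_e = self.char_emb(char_ids.reshape(b * s, c))   # (b*s, c, char_dim)
        _, (h, _) = self.char_lstm(char_e)                    # h: (2, b*s, char_hidden)
        char_repr = torch.cat([h[0], h[1]], dim=-1).reshape(b, s, -1)
        feats = torch.cat([self.word_emb(word_ids), char_repr], dim=-1)
        out, _ = self.word_lstm(feats)
        return self.emit(out)                                 # (b, s, n_tags)

model = CharWordBiLSTM(n_chars=120, n_words=20000, n_tags=9)
# RMSprop with the learning rate and decay constant reported in the abstract
# (alpha is PyTorch's name for the RMSprop smoothing/decay constant).
optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-3, alpha=0.9)
```

In this sketch the final forward and backward hidden states of the character-level BiLSTM are concatenated to form each word's character representation, mirroring the character-level word representation described in the abstract; the CRF decoding step is deliberately left out to keep the example short.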