Uncertainty query sampling strategies for active learning of named entity recognition task

Cited by: 2
Authors
Agrawal, Ankit [1]
Tripathi, Sarsij [2]
Vardhan, Manu [1]
Affiliations
[1] Natl Inst Technol Raipur, Dept Comp Sci & Engn, Raipur, Chhattisgarh, India
[2] Motilal Nehru Natl Inst Technol Allahabad, Dept Comp Sci & Engn, Prayagraj, Uttar Pradesh, India
Keywords
Active learning; named entity recognition; uncertainty query sampling;
DOI
10.3233/IDT-200048
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Active learning is a well-known approach for labeling large unannotated datasets with minimal effort and in a cost-efficient way. It iteratively selects the most informative instances and adds them to the training set so that the learner's performance improves with each iteration. Named entity recognition (NER) is a key information extraction task in which the entities present in a sequence are labeled with the correct class. Traditional query sampling strategies for active learning consider only the model's final probability value when selecting the most informative instances. In this paper, we propose a new active learning algorithm based on a hybrid query sampling strategy that considers sentence similarity along with the model's final probability value. We compare it with active learning approaches based on four other well-known pool-based uncertainty query sampling strategies for NER: least confident sampling, margin of confidence sampling, ratio of confidence sampling, and entropy query sampling. The experiments were performed on three biomedical NER datasets from different domains and one Spanish-language NER dataset. We found that all of these approaches reach the performance of the supervised learning approach while requiring much less annotated training data. In most cases, the proposed active learning algorithm performs well and further reduces the annotation cost compared with the active learning algorithms based on the other sampling strategies.
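The four baseline uncertainty strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It uses the commonly cited textbook definitions of each score and assumes each pool instance is summarized by a single class-probability distribution. How the paper aggregates token-level NER probabilities into a sentence-level score is not stated in the abstract, and the hybrid strategy is not reproduced here because the abstract does not say how sentence similarity is combined with the probability value. The function names and the `pool` data are hypothetical.

```python
import math


def least_confidence(probs):
    """1 - p(top label): higher means the model is less sure of its best guess."""
    return 1.0 - max(probs)


def margin_of_confidence(probs):
    """1 - (p1 - p2): a small gap between the top two labels gives a high score."""
    top1, top2 = sorted(probs, reverse=True)[:2]
    return 1.0 - (top1 - top2)


def ratio_of_confidence(probs):
    """p2 / p1: approaches 1 when the top two labels are nearly tied."""
    top1, top2 = sorted(probs, reverse=True)[:2]
    return top2 / top1


def entropy(probs):
    """Shannon entropy (natural log) of the predicted distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_informative(pool, score_fn, k):
    """Return the ids of the k pool instances with the highest uncertainty.

    pool maps an instance id to its predicted class probabilities
    (an assumed representation, not taken from the paper).
    """
    ranked = sorted(pool, key=lambda i: score_fn(pool[i]), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical predictions for three unlabeled instances.
    pool = {
        "a": [0.90, 0.05, 0.05],
        "b": [0.40, 0.35, 0.25],
        "c": [0.60, 0.30, 0.10],
    }
    for fn in (least_confidence, margin_of_confidence,
               ratio_of_confidence, entropy):
        print(fn.__name__, select_most_informative(pool, fn, 2))
```

In a pool-based loop, the selected instances would be sent for annotation, added to the training set, and the model retrained before the next round of scoring.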
Pages: 99-114
Page count: 16