Multicore based least confidence query sampling strategy to speed up active learning approach for named entity recognition

Authors
Ankit Agrawal
Sarsij Tripathi
Manu Vardhan
Affiliations
[1] National Institute of Technology Raipur
[2] Motilal Nehru National Institute of Technology Allahabad
Source
Computing | 2023 | Vol. 105
Keywords
Least confidence; Active learning; Named entity recognition; Speed up; 68U15; 68U01; 68W10; 68T50;
Abstract
Large amounts of new data are now readily available to collect and store from many different sources. A central problem is labeling these data correctly for machine learning applications. Active learning is a form of machine learning widely used to address this problem, because it significantly reduces the amount of labeled data required. It selects the most informative samples from the unlabeled data, has an oracle label them, and uses the newly labeled samples to train the active learner incrementally. Several query sampling strategies exist for selecting these samples. A major drawback of active learning is that it is very time-consuming. This work therefore proposes a new multi-core-based algorithm that speeds up active learning by using all of the computational resources available in the system. The experiments address named entity recognition, which labels sequences of words in unstructured text by classifying them into predefined categories. The proposed algorithm is evaluated on both performance and execution time over three named entity recognition corpora from distinct biomedical domains. The results show a considerable reduction in execution time for the proposed algorithm compared with the existing active learning approach.
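The abstract does not give the algorithm itself, so the following is only a minimal sketch of what multi-core least confidence sampling can look like, not the authors' implementation. It assumes each unlabeled sentence is represented by per-token label probability distributions and approximates the confidence of the best label sequence as the product of the top probability at each token; a real sequence model would supply its own sequence probability. The names `least_confidence`, `score_chunk`, and `select_queries` are hypothetical.

```python
from concurrent.futures import ProcessPoolExecutor
from math import prod
import os


def least_confidence(token_probs):
    """Score one sentence as 1 - P(best labeling).

    Assumption: tokens are treated as independent, so P(best labeling)
    is the product of the highest label probability at each token.
    """
    return 1.0 - prod(max(dist) for dist in token_probs)


def score_chunk(chunk):
    """Score every sentence in one chunk (runs inside a worker process)."""
    return [least_confidence(sentence) for sentence in chunk]


def select_queries(pool, k, workers=None):
    """Return indices of the k least confident sentences in the pool."""
    workers = workers or os.cpu_count() or 1
    if workers <= 1 or len(pool) < 2:
        # Serial fallback: no process pool is created.
        scores = score_chunk(pool)
    else:
        # Split the pool into roughly one chunk per worker.
        size = -(-len(pool) // workers)  # ceiling division
        chunks = [pool[i:i + size] for i in range(0, len(pool), size)]
        with ProcessPoolExecutor(max_workers=workers) as executor:
            # map() preserves chunk order, so scores stay aligned with pool.
            scores = [s for part in executor.map(score_chunk, chunks) for s in part]
    # Highest score means lowest confidence, so sort in descending order.
    ranked = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]
```

Scoring each unlabeled sentence is independent of the others, which is why the pool can be split across cores with no coordination beyond merging the scores. When calling `select_queries` with more than one worker from a script, place the call under `if __name__ == "__main__":` so that worker processes can import the module safely.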
Pages: 979 - 997
Page count: 18